EDITORIAL


During the past two years, educa- 
tion has “come alive” in ways that few 
would have predicted ten years ago.
The tremendous upsurge of support 
for research and development, large- 
scale demonstrations under field con- 
ditions, the new reform movement in 
curriculum and educational technology,
the advent of the computer and
associated hardware will have a pro- 
found impact upon the field of educa- 
tional psychology. In a similar man- 
ner, developmental psychology has 
been swept up in the vigorous new 
educational movement, resulting in 
greatly increased research and writing. 
Within the growing number of jour- 
nals sponsored by the American Psy- 
chological Association, the Journal of 
Educational Psychology is emerging 
as the primary outlet for studies and 
theoretical articles in the closely re- 
lated fields of educational and devel- 
opmental psychology. 

A relative newcomer to the group 
of journals published by APA, the 
Journal has grown appreciably in 
stature under the able leadership of 
Raymond Kuhlen, Editor during the 
first nine years of APA ownership. 
Kuhlen retires as Editor with the pub- 
lication of this issue. Because of the 
usual publication lag, manuscripts 
submitted after May 1, 1966, will 
begin to appear in the April issue 
under the new Editor. 

Although my term as Editor did not 
officially begin until January 1, 1967, 
responsibility for review of manu- 
scripts and editorial policy was as- 
sumed in July in order to plan ahead 
for issues in 1967. During the four 
months from July to October, 124 
manuscripts were received and are
now in various stages of processing. A
sufficient number have been reviewed 
to permit some general impressions 
that may be helpful to authors. As in 
the past, only a small percentage of 
manuscripts are accepted, and many 
of these require some revision. A few 
manuscripts of good quality are re- 
jected because they deal with topics 
outside the purview of the Journal. 
Much good educational research is of 
only minor psychological significance 
and is better suited for publication 
elsewhere. A major consideration in 
such decisions is the degree to which 
the problem is conceptualized in psy- 
chological terms and measures of 
psychological variables are employed. 

Most rejected manuscripts are 
judged to be insufficient in quality or 
significance rather than outside the 
field. Serious defects in the formula- 
tion of the problem, the research de- 
sign, the analysis or interpretation of 
results, or the general organization 
and presentation of a study are fre- 
quently uncovered by advisory editors 
to whom manuscripts are sent for 
critical review. Where the basic study 
is sound and the results potentially 
significant, detailed comments are sent 
to the author, who is encouraged to re- 
examine his data and revise his manu- 
script. In many cases, however, the 
shortcomings are too serious for re- 
vision to be of appreciable value. 
Other studies may be soundly executed 
and presented but are of minor im- 
portance. While some of these may be 
published as brief articles or notes, 
most simply fail to meet sufficient 
standards to justify space under pres- 
ent competition. 


Preference will be given to contri- 
butions which have some relevance to 
education although such relevance is 
often hard to determine. Special efforts
will be made to improve the
range and quality of contributions in
developmental as well as educational
psychology. Theoretical articles are
especially welcome.

WAYNE H. HOLTZMAN,
Journal of Educational Psychology
EFFECTS OF INCIDENTAL MATERIAL IN A
PROGRAMMED RUSSIAN VOCABULARY 
LESSON 


GERALD W. FAUST AND RICHARD C. ANDERSON
University of Illinois 


In each of 2 experiments, 2 versions of a stylized program to teach 
Russian vocabulary were given to college students. 1 version used 
only primitive “copying frames,” while the other version used copying 
frames in which the prompt sentence was embedded in a context of 
5 English-Russian sentences. In both experiments the Context groups 
recalled significantly more (p < .05) overtly-practiced Russian words 
on the posttests than did the No-Context groups. The advantage of 
the Context programs was especially pronounced for Ss who hurried 
through the programs. The results can be interpreted as demonstrating
that the addition of incidental material to the copying frame facilitates
associative learning by insuring that Ss at least notice the stimulus
before making the response.


The recent literature on pro- 
grammed instruction has stressed the 
importance of getting the student to 
make the correct response for the right 
reasons. Holland (1965a) has ex- 
pressed the point aptly, stating that 
“the answer required of the subject 
should be one he can give if, and only 
if, appropriate precursory behavior 
has occurred [p. 78].” “Appropriate 
precursory behavior” may include 
reading, examining a chart, or solving 
a problem. At a minimum, “appro- 
priate precursory behavior” consists 
of noticing the stimulus that it is 
hoped will control the response in the 
future. 

There is some evidence to support 
the principle that program effective- 
ness depends upon designing frames 
so that students cannot answer cor- 
rectly unless they engage in the de- 
sired precursory behavior. Holland 
(1965b) and Krumboltz (1964) com- 
pared standard programs with pro- 
grams modified so that trivial re- 
sponses were required of subjects (Ss). 
In each case Ss who received the 
standard program did substantially 
better on the posttest than those who 
received the trivial-response program. 


Maybe the trivial-response programs 
were poorer because Ss no longer had 
to engage in “appropriate precursory 
behavior” in order to respond cor- 
rectly. But there is another explana- 
tion of these data. In the trivial-re- 
sponse programs Ss did not so often 
emit the important responses which 
they later would have to emit to earn 
a good score on the posttest. It is well 
known that making overt responses 
facilitates response-learning (Mc- 
Geoch & Irion, 1952, p. 509), particu- 
larly when response m is low, such as 
is the case with unfamiliar, technical
terms (Williams, 1965), foreign lan- 
guage vocabulary (Fry, 1960), or 
trigrams (Eigen & Margulies, 1963). 
It is quite possible that the trivial- 
word programs were inferior to the 
unaltered versions because in the al- 
tered programs Ss emitted important 
responses less often. Research that 
manipulates the likelihood of appro- 
priate precursory behavior, while 
holding constant the amount of overt 
practice of important responses, is 
necessary to prove that program effec- 
tiveness depends upon insuring that 
the response is contingent upon ap- 
propriate precursory behavior. 
Accompanying the argument that
the response should be made contingent
upon precursory behavior, such
as reading relevant material, there is
a second argument, treated as though 
it were a corollary: “Incidental ma- 
terial,” that is, material upon which 
the response is not contingent, has no 
place in the self-instructional program. 
A high proportion of incidental ma- 
terial is claimed to be a sure sign of a 
bad program. The implication is that 
every shred of material irrelevant to 
the response in the frame in which the 
material appears should be excised. 

The argument against incidental 
material does not follow logically from 
the principle that the response should 
be contingent upon appropriate pre- 
cursory behavior. Provided a frame 
contains at least some relevant ma- 
terial upon which the response does 
depend, it would, as far as this prin- 
ciple is concerned, seem to be a matter 
of indifference as to whether the frame 
also contained incidental material. 

While it cannot be assumed that 
much will be learned from the inci- 
dental material itself (Eigen & Mar- 
gulies, 1963), there are several reasons 
for supposing that, under certain cir- 
cumstances, the presence of incidental 
material might facilitate learning from 
the material which does entail overt 
responding. First, there is the obvious 
point that suitable incidental material 
could make a program more interest- 
ing, and perhaps as a result keep the 
student working at the program such 
that he comes under the control of the 
contingencies which it embodies. Sec- 
ond, incidental material might force 
stimulus discrimination and response 
differentiation not required by a 
“stripped down” version of a program. 
The “blackout procedure” (Holland & 
Kemp, 1965), or a similar technique, 
could probably be applied to a pro- 
gram so as to make correct responding 
easier on some frames, easier because
fine discriminations were no longer required.
As a result of removing incidental
material, students might prove
unable to make certain discriminations
on the posttest. Third, speaking
more speculatively, incidental material
might serve as “filler” so as to
give a distributed practice effect or a
spaced review effect. Fourth, under
some conditions, incidental material
might increase the likelihood that the
student will notice or pay attention to
the stimulus before making a response.

The chief purpose of the two experiments
described in the next section
was to demonstrate that incidental
material can facilitate learning from
material upon which overt responses
are conditional by requiring Ss to notice
the stimulus before making the
response. Two versions of stylized
programs to teach Russian vocabulary
were prepared. One program employed
primitive “copying frames” (Markle,
1964), consisting of a prompt sentence
with an English subject and its Russian
equivalent as the predicate nominative,
and the same sentence immediately
below with the Russian
word replaced by a blank. The copying
frame assures correct responses but
permits the student to complete the
blank correctly without attending to
that part of the stimulus object which
it is hoped will control the response in
the future. The S who is given a series
of copying frames, like the one shown
in Figure 1, could simply copy the
Russian word (“stohl”) in the blank
without ever reading the complete
sentences. Such behavior may lead to
good response learning but it will certainly
not lead to any hookup between
the English stimulus (“table”) and the
Russian response (“stohl”). In othe 
words, the obvious copying frame does 
not guarantee that the English word 
will become a discriminative stimulus 
for its Russian equivalent. 
COPYING FRAME

A table is a stohl.
A table is a ——.

COPYING FRAME WITH CONTEXT

A rag is a tryapka. A bridge is a mohst. A table is a stohl. A college is a vooz. An onion is a look.
A table is a ——.

Fig. 1. Two versions of the copying frame. (The lines trace the minimal eye movements
that can be used to correctly fill in the response blank.)

The second program was identical
to the first except that the prompt
sentence was embedded in a para- 
graph with four other English-Russian 
sentences. Figure 1 illustrates the eye 
movements (a potentially observable 
aspect of “attention” or inspection behavior)
necessary to complete a blank
when incidental material is added to
a copying frame. Notice that S must 
find the stimulus term and discrimi- 
nate it from the surrounding material 
in order to fill in the blank. The stu- 
dent might not use the inspection tech- 
nique diagramed in Figure 1. He might 
read all of the sentences in the para- 
graph before making a response, or 
he might indulge in some covert re- 
hearsal, and so on. However, the addi- 
tion of incidental material guarantees 
that he will engage in at least minimally
adequate inspection behavior.
Such frames are “failsafe” in the sense 
that S cannot simply copy a response 
without giving any attention to the 
stimulus. The prediction is that, de- 
spite possible interference from the in- 
cidental material, the addition of inci- 
dental material to the copying frame 
will result in a net increment in the 
learning of overtly-practiced items by 
facilitating S-R hookup. 


METHOD 


Experiment I 


Sample. Forty-eight graduate students en- 
rolled in an educational psychology course 
volunteered to act as Ss for this experiment.
Students who had any knowledge of Russian 
were excluded from the sample. 

Procedure. Each of the two groups re- 
ceived a training sequence entailing 10 
presentations of 12 English-Russian word 
pairs, a posttest, new word pairs presented 
by the anticipation method and, finally, a 
retention test on the initially-learned words. 

The Russian words used in the programs 
were selected from a list of 212 four- to 
seven-letter Russian words. Pronounceability
ratings (PR) were obtained for each of
these words from 34 summer institute par- 
ticipants, using the procedure described by 
Underwood and Schultz (1960). Two 12-word 
lists (List A and List B) were chosen from 
the 212 rated words such that each list con- 
tained six easy words (PR < 3.00) and six 
hard words (PR > 6.00). Each of these lists 
was further constrained to contain: (a) only 
words with common English meanings, (b) 
no two Russian words or English words with 
the same initial letter, (c) no two words with 
very similar phonetic or orthographic con- 
struction, (d) no Russian word with an ob- 
viously strong association to its English 
equivalent. 
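
The selection rules just listed can be illustrated with a minimal Python sketch. It filters a rated pool into easy and hard candidates using the PR cutoffs given above and enforces constraint (b), read here as applying within each language. The function name, the word pool, and its ratings are hypothetical; constraints (a), (c), and (d) call for human judgment and are not modeled.

```python
def pick_list(rated_words, n_easy=6, n_hard=6, easy_max=3.00, hard_min=6.00):
    """Greedily pick easy and hard word pairs with distinct initial letters.

    rated_words: iterable of (english, russian, pr) tuples (hypothetical data).
    """
    chosen, used_en, used_ru = [], set(), set()
    counts = {"easy": 0, "hard": 0}
    limits = {"easy": n_easy, "hard": n_hard}
    for english, russian, pr in rated_words:
        if pr < easy_max:
            level = "easy"
        elif pr > hard_min:
            level = "hard"
        else:
            continue  # words with middle ratings are not used
        if counts[level] >= limits[level]:
            continue  # this difficulty level is already full
        # Constraint (b): no repeated initial letter among the English words
        # or among the Russian words (assumed to apply within each language).
        if english[0] in used_en or russian[0] in used_ru:
            continue
        chosen.append((english, russian, level))
        used_en.add(english[0])
        used_ru.add(russian[0])
        counts[level] += 1
    return chosen


# Made-up ratings, for illustration only.
pool = [("table", "stohl", 2.0), ("tree", "derevo", 2.5), ("bridge", "mohst", 2.2)]
selected = pick_list(pool, n_easy=2, n_hard=0)
```

Here "tree" is skipped because "table" already uses the initial letter t, so the two easy slots go to "table" and "bridge".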

Each frame in the No-Context program 
consisted of two sentences. The first was 
the prompt sentence, containing an English 
subject and its Russian equivalent as the 
predicate nominative. The second sentence
was the same as the first except that the 
Russian word was replaced with a blank. 
This arrangement is pictured in Figure 1. 
Half of the Ss receiving the No-Context 
program were taught List A words. The 
remainder were taught List B words. The or- 
der of appearance of words within trials was 
randomized independently for each trial, 
and there were two completely different 
randomizations overall. 
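
As a rough illustration of the No-Context frame described in this paragraph, the following Python sketch builds the prompt sentence and the matching response sentence for one word pair. The function name and the blank marker are assumptions made for the example; the sentence pattern follows the frames shown in Figure 1.

```python
def copying_frame(english, russian, blank="____"):
    """Return (prompt sentence, response sentence) for one copying frame."""
    # Choose the article the way the Figure 1 sentences do ("A table", "An onion").
    article = "An" if english[0].lower() in "aeiou" else "A"
    prompt = f"{article} {english} is a {russian}."
    response = f"{article} {english} is a {blank}."
    return prompt, response


prompt, response = copying_frame("table", "stohl")
# prompt   -> "A table is a stohl."
# response -> "A table is a ____."
```

Because the Russian word sits directly above the blank, the frame can be completed by copying alone, which is the weakness the Context program is meant to remove.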

The Context program was identical to the 
No-Context program except that the prompt 
sentence was embedded in a paragraph with 
four other English-to-Russian sentences as 
is depicted in Figure 1. Over the 10 training 
trials the prompt sentence for a given word 
pair appeared twice in each of the five 
ordinal positions within the paragraph. Half 
of the Ss who received the Context program 
overtly practiced List A words and received 
List B words as part of the incidental ma- 
terial. The assignment was reversed for the 
rest of the Ss. Each incidental word on the 
unpracticed list appeared once per train- 
ing trial, and over the 10 training trials the 
sentence containing any incidental word ap- 
peared twice at each ordinal position within 
the paragraph of five sentences. The remaining
three English-Russian sentences can be
called filler sentences to distinguish them 
from incidental sentences. The incidental 
sentences and filler sentences together com- 
prised the context. Filler words were not 
systematically chosen. The order of frames 
was randomly determined for each trial in- 
dependently of other trials. There were two 
completely different randomizations of the 
context materials. 
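
The counterbalancing of ordinal position described above (each prompt sentence appearing twice in each of the five positions over the 10 training trials) can be sketched as follows. This is only one way to produce such a schedule; the function name and the use of a seeded random generator are assumptions, and the sketch does not model the further requirement that the prompt and incidental sentences of one frame occupy different positions.

```python
import random


def position_schedule(n_trials=10, n_positions=5, rng=None):
    """Return one ordinal position per trial, with every position used equally often."""
    if n_trials % n_positions:
        raise ValueError("n_trials must be a multiple of n_positions")
    rng = rng or random.Random()
    # Two copies of positions 1..5 for 10 trials, then shuffled across trials.
    schedule = list(range(1, n_positions + 1)) * (n_trials // n_positions)
    rng.shuffle(schedule)
    return schedule


schedule = position_schedule(rng=random.Random(0))
```

Sorting any schedule produced this way gives 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, which is the balance the procedure calls for.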

The posttest and the retention test con- 
sisted of frames containing a sentence with 
an English subject and a blank in place of 
the Russian predicate nominative. Each test 
required recall of both overtly-practiced 
words and incidental words. Between the 
two tests all Ss received 10 anticipation trials 
on the incidental words. A strict scoring
procedure, in which a Russian word was 
counted correct only if it was spelled cor- 
rectly, was employed. A more relaxed scor- 
ing procedure was tried and then dropped 
since it did not appear that method of scor- 
ing interacted with other variables. 
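
The strict scoring rule can be expressed as a short Python sketch: a response counts only if it matches the keyed Russian word letter for letter. The function name and the dictionary layout are assumptions for illustration. The relaxed procedure is not modeled because its criterion is not described.

```python
def strict_score(responses, answer_key):
    """Count responses spelled exactly like the keyed Russian word.

    responses and answer_key map each English word to a Russian spelling.
    """
    return sum(
        1
        for english, russian in answer_key.items()
        if responses.get(english, "").strip() == russian
    )


key = {"table": "stohl", "onion": "look"}
answers = {"table": "stohl", "onion": "luk"}  # second word misspelled
score = strict_score(answers, key)  # only "table" is counted
```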

The eight sets of experimental materials 
(Context versus No-Context × Lists ×
Randomizations) were typed on buff-colored
IBM cards and then placed in three-ring 
notebooks. Groups of between four and 
eight Ss completed the experiment at one 
time. Prior to each session, the materials
were stacked in a random order. After the
Ss were seated, the materials were taken
from this stack and passed out clockwise
around the table. The Ss read the instructions
which came with their programs and
then were instructed by the experimenter on
how to record the time both before and
after each trial. All Ss were seated facing
a wall clock from which they were able to
record the time in minutes and seconds. The
time record and all responses were written
on separate answer sheets which were collected
before the test trials. An experimenter
was always present during sessions.


Experiment II 


Sample. Fifty-four undergraduates en- 
rolled in an educational psychology class 
acted as Ss to fulfill a course requirement.
Once again Ss who were familiar with Rus- 
sian were excluded from the sample. 

Procedure. The two program versions used 
in this experiment were basically the same 
as those used in Experiment I. However, 
only one list of words received overt prac- 
tice. The Ss were not tested for recall of 
incidental words. No retention test was 
given. The S received a posttest consisting 
of two test trials. In order to allow more 
chance for differentiation between program 
groups, lists of 16 words (8 hard words and 
8 easy words) instead of the original 12 were 
used. In this experiment the PR of hard 
words ranged from 3.00 to 7.00. Because of 
the constraints imposed on word lists (see 
Experiment I) only 11 words from the origi- 
nal lists were used. The program format was 
changed slightly in that training and test 
frames were mimeographed and placed in 
consumable booklets. The Ss now wrote their 
responses into the program booklets instead 
of on separate answer sheets.

When Ss finished the posttest, they com- 
pleted a questionnaire which inquired into 
attitudes toward the experiment and effort 
expended to learn the words. Six open-ended 
essay questions, presented first in the hope 
that Ss would volunteer information, asked 
for comment on the amount of rehearsal or 
self-testing, interest, persistence in attempt- 
ing to learn the words, shortcuts used in fill- 
ing in the blanks, etc. The Ss were then 
asked to complete nine multiple-choice 
questions intended to elicit the same infor- 
mation as the open-ended questions, but 
which were more pointed in wording. 

In all other respects, the procedure in Experiment II
was the same as that in Experiment I.
RESULTS

A one-tailed t test demonstrated the
predicted superiority in recall of
overtly-practiced words for the Context
group in Experiment I (t = 2.20,
df = 46, p < .05), though the analysis
of variance (see Table 1) failed to 
show a significant main-effect differ- 
ence for program. On the posttest and 
retention test combined, the Context 
group recalled a mean of 12.5 overtly- 
practiced words (highest possible 
score = 24) whereas the No-Context 
group recalled a mean of 10.4 words, 
a 20% advantage for the former group. 
This superiority can be traced pri- 
marily to the absence of very low 
scores in the Context group. The eight 
lowest scores were produced by Ss who 
received No-Context programs. Fur- 
thermore, the advantage of the Con- 
text group was due almost entirely 
to better recall of the easy words, as 
is evidenced by the significant interaction
between Program × Word Difficulty
(see Table 1). The Context
group enjoyed a mean superiority of 
2.1 easy words. There was no differ- 
ence between the program groups in 
hard words recalled. 
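
The 20% figure quoted above follows directly from the two group means. A small Python check, with a hypothetical helper name:

```python
def percent_advantage(mean_a, mean_b):
    """Percentage by which mean_a exceeds mean_b."""
    return 100.0 * (mean_a - mean_b) / mean_b


# Experiment I means on the posttest and retention test combined.
advantage = percent_advantage(12.5, 10.4)
print(round(advantage))  # prints 20
```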

Still speaking of Experiment I, as 
had been expected, significantly fewer 
(p < .01) hard words than easy words 
were learned. Pooling over other con- 
ditions a mean of 2.8 hard words and 
a mean of 8.7 easy words were recalled.
Significantly more (p < .01)
words were recalled on the posttest 
than the retention test. Retention in- 
terval did not interact with any of the 
other variables investigated. The interaction
of Word Difficulty × List
was unexpectedly significant (p < .01),
probably because the variability
in word difficulty happened to be 
greater in List A. 
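
For readers checking Table 1, each F ratio is the mean square for an effect divided by the mean square of the "Persons within groups" error term in the same stratum. The Python sketch below reproduces two of the tabled values from their printed mean squares; the helper name is hypothetical.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


# Program × L, tested against persons within groups (between subjects).
print(round(f_ratio(15.19, 5.55), 2))  # prints 2.74
# D × R, tested against persons within groups × D × R.
print(round(f_ratio(1.35, 0.58), 2))   # prints 2.33
```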

TABLE 1

ANALYSIS OF RECALL VARIANCE IN EXPERIMENT I

Source                              df       MS          F
Between subjects
  Program                            1     13.02        2.36
  List (L)                           1       .08         .002
  Program × L                        1     15.19        2.74
  Persons within groups             44      5.55
Within subjects
  Word difficulty (D)                1    426.00      161.30**
  D × program                        1     16.35        6.19*
  D × L                              1     38.17       14.16**
  D × program × L                    1       .69         .26
  Persons within groups × D         44      2.04
  Retention interval (R)             1     35.02       42.19**
  R × program                        1       .005        .0006
  R × L                              1  [illegible]      .20
  R × L × program                    1      3.02        3.04
  Persons within groups × R         44       .88
  D × R                              1      1.35        2.33
  D × R × program                    1       .18         .81
  D × R × L                          1      1.18        1.94
  D × R × program × L                1      1.02        2.79
  Persons within groups × D × R     44       .58

* p < .05.
** p < .01.

A mean of .8 words from the incidental
list was recalled on the posttest
by the Context group, which, of course,
was the only group exposed to these
words during training. Only five Ss 
recalled any incidental words. For 
these five a mean of 3.8 words was re- 
called and all but one of the incidental 
words recalled was an easy word. It 
should be emphasized that only one- 
fifth of the incidental material was 
tested. Extrapolating, probably an 
average of 4.0 incidental words was 
learned overall, accounted for entirely 
by a small percentage of Ss. Putting 
the matter another way, an S was 
about 15 times as likely to learn an 
overtly-practiced word as he was to 
learn an incidental word. 

Dividing the Ss within each program 
group into quartiles on the basis of 
time taken to complete training, and 
then comparing the program groups 
on the basis of recall in each of these 
quartiles (Figure 2), reveals that the 
recall of the No-Context Ss was an 
increasing function of the amount of 
time spent on the training sequence. 


[Figure 2: line graph of percentage of recall (vertical axis) against training time quartiles 1 to 4 (horizontal axis), with separate lines for the Context group and the No-Context group.]

Fig. 2. Percentage of recall as a function
of training time and program group in Experiment I.


This was not true of the Context Ss 
as there were only minor fluctuations 
in recall among their time quartiles. 
Analysis of variance showed that 
while neither the main effect of training
time (F = 2.53, df = 1/44, p > .05)
nor of program (F = 2.75, df =
1/44, p > .05) was significant, there
was a significant interaction of Training
Time × Program (F = 8.02, df =
1/44, p < .01). The interaction can be 
attributed to the superior performance 
of the Context Ss in the first time 
quartile. The Context group took 
longer to complete training (t = 3.59, 
df = 46, p < .01) than did the No- 
Context group. 
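
The quartile comparison described above can be sketched in Python as follows: Ss within one program group are ordered by training time, cut into four equal groups, and mean recall is computed for each. The function name and the sample records are hypothetical, and the sketch assumes the group size is a multiple of four.

```python
from statistics import mean


def recall_by_time_quartile(records):
    """Mean recall for training-time quartiles 1 (fastest) through 4 (slowest).

    records: list of (training_time, recall_score) for one program group.
    Assumes len(records) is a multiple of 4.
    """
    ordered = sorted(records)  # fastest Ss first
    size = len(ordered) // 4
    return [
        mean(score for _, score in ordered[q * size:(q + 1) * size])
        for q in range(4)
    ]


# Made-up (time, recall) pairs for eight Ss.
group = [(10, 2), (12, 4), (20, 6), (22, 8), (30, 10), (32, 12), (40, 14), (42, 16)]
quartile_means = recall_by_time_quartile(group)
```

Running the same function on each program group and comparing the two lists quartile by quartile gives the kind of contrast plotted in Figure 2.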

Turning now to Experiment II, a
one-tailed t test once again demonstrated
the predicted superiority of the
Context group in the recall of overtly-practiced
words (t = 1.98, df = 52, p < .05).
On the two test trials the No-Context
group recalled a mean of 20.0
words (highest possible score = 32)
and the Context group recalled a mean 
of 21.7 words. Though the absolute 
difference in words recalled was only 
slightly less than in Experiment I, the 
Context group in Experiment II re- 
called only 9.1% more words than did 
the No-Context group, whereas in Ex- 
periment I the advantage was 20%. 
The percentage of difference was 
smaller in Experiment II because both 
groups learned a higher proportion of 
the words. 
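
The t tests reported for both experiments compare two independent groups, and their degrees of freedom (46 with 48 Ss, 52 with 54 Ss) equal the total number of Ss minus two. The Python sketch below computes such a statistic with a pooled variance estimate. That the authors pooled variances is an assumption, and the function name and sample scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent-groups t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weight each sample variance by its own degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Made-up recall scores for two small groups.
t_value, df = pooled_t([3, 4, 5], [1, 2, 3])
```

A one-tailed test, as used here, refers the resulting t to the critical value for the predicted direction only.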

As in Experiment I the superiority 
of the Context group was due primarily
to the absence of low recall scores.
The Context group accounted for only 
3 of the 14 cases in which 6 or fewer 
words were recalled. 

Once again significantly more (F = 
89.5, df = 1/52, p < .01) easy words 
than hard words were learned. A mean 
of 6.4 easy words and 4.1 hard words 
were recalled. The major difference 
between the groups was in the number 
of easy words recalled. The Context 
group enjoyed a mean advantage of
1.5 easy words and an advantage of 
only .2 hard words. 

Figure 3, which compares the mean 
recall per training-time quartile for 
the two program groups, is very simi- 
lar to Figure 2. In both Figure 3 and 
Figure 2 the Context group showed 
considerably greater recall in the first 
time quartile. In Figure 3, unlike Fig- 
ure 2, this superiority was even greater 
in the second quartile. A one-tailed
test of the interaction between Training
Time × Program Condition
showed the expected superiority of the
Context group in the low time quartiles
(t = 2.25, df = 50, p < .05).¹ The
regression lines in Figure 3 show a
steeper slope than those in Figure 2, a
fact that can probably be traced to the
“easier hard words” used in Experiment II.
The Ss who spent more time
during training were able to make
greater progress in learning the hard
words. In other words, there was a
higher ceiling than in Experiment I.
The Context group took longer than
the No-Context group to complete
training (t = 3.02, df = 52, p < .01).

¹ We are indebted to Gene V. Glass, personal
communication, January, 1965, for developing
the one-tailed interaction test.


DISCUSSION


These experiments close one hole in 
the argument that the effectiveness of 
self-instructional programs depends 
upon designing frames so that “the 
answer required of the subject [is] one 
he can give if, and only if, appropriate 
precursory behavior has occurred 
[Holland, 1965a, p. 78].” Evidence previously
advanced in support of this
principle, namely the studies involving
trivial-response programs, could be ex- 
plained in terms of a response-learning 
effect. In the present experiments all 
groups got the same amount of overt 
practice of important responses; hence, 
the facilitation can be attributed to 
the features of the Context programs 
that insured appropriate precursory 
behavior. 

The results of the two experiments 
do support the contention that the ad- 
dition of incidental material to the 
copying frame improves the perform- 
ance of some Ss by requiring them to 
notice the discriminative stimulus be- 
fore making the response. It should be 
emphasized that the prediction of su- 
perior recall by the Context groups 
assumes inadequate inspection behav- 
ior by the No-Context groups. Not all 
Ss in the No-Context groups employed 
inadequate inspection behavior. Only 
Ss who attempted to shortcut the 
blank-filling procedure by simply 
copying response terms were eliminat- 
ing the chance for S-R hookup. Thus, 
only these Ss should be expected to fall 
short of their counterparts in the Con- 
text groups. Since Ss in the first and 
second time quartiles were those most 
likely to have been taking shortcuts 
in filling in the blanks, the superior 
performance of the first- and second- 
quartile Ss who used the Context pro- 
grams, and the failure to find differ- 
ence between programs for the third 
and fourth quartiles, is understanda- 
ble. 

The fact that Context groups 
showed an advantage on easy words 
but not on hard words is also under- 
standable. Embedding the prompt sen- 
tence in a paragraph of incidental ma- 
terial literally requires S to “notice” 
the discriminative stimulus. Just no- 
ticing the discriminative stimulus may 
be a sufficient condition for associative 
learning in the case of easy words. Our 
belief is that just noticing the stimulus 
was not sufficient in the case of hard
words. For hard words S may need to 
search for mediating associations or 
engage in self-testing and covert re- 
hearsal. There is little reason for be- 
lieving that the Context treatment 
could “buy” these additional precur- 
sory behaviors. 

The questionnaire data from the 
second experiment support the analy- 
sis of inspection behavior that has 
been advanced, particularly the inter- 
pretation of training time. Rated “con- 
scientiousness” correlated .44 with 
training time. When compared to Ss 
with slower-than-median training
times, those with faster-than-median 
times (both program groups pooled) 
reported a significantly greater tend- 
ency (t = 2.07, df = 52, p < .05) to 
"just try to fill in the blanks as 
quickly as possible [without spending] 
extra time studying the words." Many 
of those with fast training times ac- 
tually reported using the inspection 
behaviors diagramed in Figure 1. 

The first experiment demonstrated 
once again that it is not safe to as- 
sume that Ss will learn from incidental 
material contained in programs. 

There was some indirect evidence 
that the incidental material interfered 
to some extent with the recall of 
overtly-practiced words. Obviously in- 
terference could occur only for Ss who 
learned or partially learned incidental 
words. Four of the five Ss who recalled 
any incidental words were in the 
fourth training-time quartile. Con- 
sidering now only Ss in the fourth 
quartile, those who received Context 
programs actually recalled fewer 
overtly-practiced hard words than 
comparable Ss who received No-Con- 
text programs. This fact is consistent 
with the notion that interference affected
performance of conscientious Ss
who received the Context programs, 
since interference has its maximum ef- 
fect upon the least well-learned or 
most difficult items (Underwood & 
Schultz, 1960). It is possible that if it 
were not for this interference conscien- 
tious Ss who received the Context pro- 
grams would have surpassed the conscientious
Ss who got the No-Context
programs. 
OF PLAUSIBLE MULTIPLE-CHOICE ALTERNATIVES


R. J. KARRAKER 
University of Missouri at Kansas City 


To test the hypothesis that plausible wrong responses in multiple- 
choice programs are recalled as being correct, 72 college freshmen 
were stratified into high- and low-ability levels, then randomly as- 
signed to No Knowledge of Results, Knowledge of Results, or Control 
treatments. The Ss responded on a multiple-choice test in educational 
psychology, then a recall criterion test was administered. The score was 
the number of responses that were the same as the plausible wrong re- 
sponses in the multiple-choice test. No Knowledge of Results resulted 
in significantly more (p < .01) errors than Knowledge of Results or 
Control. When No Knowledge of Results was given, Ss did recall more 
plausible wrong responses as being correct, but this effect did not ap- 
pear when Knowledge of Results was given. 


Immediate knowledge of results is 
frequently included in lists of depend- 
able principles of learning (Hilgard, 
1956, p. 487; Watson, 1960, p. 254), 
and there is a considerable body of 
evidence to validate this conclusion 
using many types of apparatuses and 
species of organisms (Ammons, 1956). 
However, in applying this principle to 
verbal learning, the issue has been 
found to be less axiomatic. For in- 
stance, two investigators report no 
significant difference attributable to 
knowledge of results (Feldhusen & 
Birt, 1962; Hough & Revsin, 1963), 
and others report knowledge of results 
to be interacting with level of achieve- 
ment and difficulty of item (Goldbeck 
& Campbell, 1962; Gordon, 1966);
motivation (Kight & Sassenrath,
1966); and duration of the response-knowledge
of results interval (Evans,
1960). 

However, Pressey (1950) and Angell 
and Troyer (1948) report unqualified 


1 Part of the data was presented at the
February, 1966, meeting of the American
Educational Research Association, Chicago,
Illinois. Appreciation is expressed to K. D.
Orton and Clinton I. Chase for consultative
services during the preparation of this pa-
per.

advantages in utilizing a machine and
a device called the punchboard to 
provide immediate knowledge of re- 
sults to improve verbal learning. Skin- 
ner also considers knowledge of results 
an extremely important variable in 
the learning process. Skinner (1954, 
p. 91) states, “...the lapse of only a
few seconds between response and re- 
inforcement (knowledge of successful 
results) destroys most of the effect.” 

Skinner considers it desirable for 
subjects (Ss) to receive knowledge of 
results in order for reinforcement to 
occur. Consequently, Skinner (1958, 
p. 970) suggests that verbal material 
must always be written to “...help 
the student to come up with the right 
answer.” Pressey (1960, p. 501) con- 
siders it desirable that students be ex- 
posed to incorrect alternatives to teach 
frequently occurring misunderstand- 
ings. Skinner (1958, p. 970) states, 
“...effective multiple-choice material
must contain plausible wrong re-
sponses, which are out of place in the
delicate process of shaping behavior
because they strengthen unwanted
forms.”

Kaess and Zeaman (1960) report 
that negative knowledge of results 
(knowledge of unsuccessful results) 
has a deleterious effect on learning.
They administered a multiple-choice
test, gave knowledge of results, and 
readministered the same test. They 
found that students responded more 
often than chance would suggest to 
their prime distractor, indicating some 
students learned the wrong response 
from exposure to a wrong alternative. 
However, of the responses that were 
incorrect on the initial trial, 53% were 
correct on the second trial. 

Pressey (1950) found that after ad- 
ministration of a multiple-choice test 
with a punchboard, students showed 
a gain in number of correct responses 
on a retake of the same test, on trans- 
fer tests, and on tests requiring a con- 
structed response. However, there was 
no information given in regard to the 
extent to which students responded on 
the transfer and constructed response 
tests with incorrect alternatives re- 
called from the multiple-choice test. 
The latter seems more directly to an-
swer the question: Do plausible-sound-
ing wrong alternatives strengthen un-
wanted forms of behavior? 


METHOD


Subjects 


The Ss were 72 college freshmen enrolled 
in a class entitled Introduction to Educa- 
tional Psychology. They were divided into 
ability levels on the basis of their “Gamma 
IQ" on the Otis Quick-Scoring Mental Abil- 
ity Test. The mean of the high-ability group 
was 130 and that of the low-ability group 
was 121. The Ss were randomly assigned to 
treatments within their ability levels.


Treatments 


Knowledge of Results. The Ss were ad- 
ministered a multiple-choice achievement 
test in educational psychology. The follow- 
ing class period, the test was again distrib- 
uted among Ss and the experimenter (E) 
read the items with the correct answers. No 
discussion of the items was permitted. The 
class was told there was not sufficient time 
available for discussion. The E did not give 
any indication that a constructed response 


test would be administered the following class 
period. 

No Knowledge of Results. The Ss were 
administered the same multiple-choice test, 
but were not given access to the test again.

Control. The Ss did not take the multi- 
ple-choice test. 


Tests Employed 


The 40 items in the multiple-choice test 
covered a number of topics in educational 
psychology. Reliability of the test estimated
by the Spearman-Brown prophecy formula
was .84.
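
As a brief illustration of the Spearman-Brown prophecy formula named above, the following sketch steps a half-test correlation up to full-test length and, in reverse, recovers the half-test correlation implied by a reported reliability of .84. It assumes the usual split-half use of the formula (length factor of 2); the report does not say how the halves were formed, and the function names are hypothetical.

```python
def spearman_brown(r_half, factor=2.0):
    """Predicted reliability when a test is lengthened by `factor`.

    With factor=2 this is the usual split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return factor * r_half / (1 + (factor - 1) * r_half)


def half_test_r(r_full):
    """Invert the factor-2 formula: half-test correlation implied by r_full."""
    return r_full / (2 - r_full)


# Half-test correlation implied by the reported reliability (assumed split-half).
implied_half = half_test_r(0.84)
recovered_full = spearman_brown(implied_half)
print(round(implied_half, 3), round(recovered_full, 2))
```

Under these assumptions the implied correlation between halves is about .72.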

On the third class period, Ss in all treat- 
ment groups were administered the criterion 
test. This test was a constructed response 
test utilizing the same stem of the multiple- 
choice item, but requiring Ss to compose 
or recall the correct answer. Their score was 
the number of recall responses that were 
in essence the same as one of the three 
wrong alternatives in the multiple-choice 
test. The E scored these constructed re- 
sponses “blind,” in that he did not know 
to which treatment group individual Ss had 
been assigned. 

When the criterion test was scored for 
correct answers, the reliability coefficient 
estimated by the Spearman-Brown prophecy 
formula was .78. The mean number of cor- 
rect responses for the Knowledge of Results 
treatment was higher (M = 34, SD = 3)
than the Control (M = 28, SD = 7) or the
No Knowledge of Results (M = 31, SD = 
5). In comparing these three means, only the 
Knowledge of Results and Control differ- 
ence was significant at the .01 level (t = 
3.78; df = 46; two-tailed test). The Knowl- 
edge of Results and No Knowledge of Re- 
sults comparison yielded a t of 2.47, which 
was significant at the .05 level (df = 46; 
two-tailed test). 
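
The t values above can be roughly checked from the rounded means and standard deviations. The sketch below computes an independent-samples t with a pooled variance; it assumes 24 Ss per treatment (consistent with df = 46), and because the published means and SDs are rounded to whole numbers, the results only approximate the reported 3.78 and 2.47.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Rounded values from the text; n = 24 per group is an assumption based on df = 46.
t_kr_control, df = pooled_t(34, 3, 24, 28, 7, 24)
t_kr_nkr, _ = pooled_t(34, 3, 24, 31, 5, 24)
print(round(t_kr_control, 2), round(t_kr_nkr, 2), df)
```

With these inputs the two values come out near 3.86 and 2.52, in line with the reported figures given the rounding.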


RESULTS 


Means and standard deviations of 
the plausible wrong responses from 
the multiple-choice test that were re- 
called as being correct on the con- 
structed response test are given in 
Table 1. The data were collapsed 
across levels for the treatment means, 
and across treatments for the ability 
means. 

To test for differences among the 
groups, a 3 (treatments) x 2 (levels) 
TABLE 1
MEAN AND STANDARD DEVIATION OF PLAUSIBLE WRONG RESPONSES FROM
MULTIPLE-CHOICE TEST RECALLED ON CONSTRUCTED RESPONSE TEST

Group                   | N  | M    | SD
Knowledge of results    | 24 | 3.54 | 2.58
No knowledge of results | 24 | 6.33 | 2.82
Control                 | 24 | 3.70 | 2.51
Low ability             | 36 | 5.19 | 3.14
High ability            | 36 | 3.86 | 2.50


factorial analysis of variance was 
used. An F of 8.93 (df = 2/66; p < 
.001) was obtained for treatments, and 
an F of 4.99 (df = 1/66; p < .01) was 
obtained for ability levels. The Treat- 
ment X Ability Levels interaction was 
not significant, as it gave an F of 1.49 
(df = 2/66; p > .05). 
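
The treatment effect can be approximated from the summary statistics in Table 1. The sketch below is a one-way analysis of variance on the three treatment groups only; it is not the 3 x 2 factorial analysis reported here, so its error term still contains the ability-level variance and its F (on 2 and 69 df) should come out somewhat below the reported 8.93.

```python
def one_way_f(groups):
    """One-way ANOVA F from summary data.

    groups: list of (n, mean, sd) tuples, one per group.
    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# (N, M, SD) for Knowledge of Results, No Knowledge of Results, Control (Table 1).
treatments = [(24, 3.54, 2.58), (24, 6.33, 2.82), (24, 3.70, 2.51)]
f_value, df_b, df_w = one_way_f(treatments)
print(round(f_value, 2), df_b, df_w)
```

The one-way F is roughly 8.45, which is compatible with the larger factorial F once ability level is removed from the error term.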

The Duncan multiple-range test re- 
vealed the No Knowledge of Results 
treatment resulted in significantly 
more (p < .01) errors than the Knowl-
edge of Results or Control treatments.
The Knowledge of Results and Con-
trol treatments did not differ signifi-
cantly. The low-ability group made
significantly more (p < .01) errors
than the high-ability group. The in- 
teraction was not significant. 


Discussion 


The results of this study can be 
interpreted in at least two ways. If one 
attends to the fact that the most errors 
were made by the group that was ex- 
posed to the plausible wrong responses 
without knowledge of results, one con- 
cludes with Kaess and Zeaman (1960) 
that plausible wrong responses inter- 
fere with learning. In a recent review 
of the literature, Holland (1965, p. 
104) arrives at the same conclusion. 

However, all programs provide for 
knowledge of results, so the more im- 
portant comparison becomes the 
Knowledge of Results and the Control 


treatments. It seems noteworthy that 
in this experiment, when knowledge of 
results was given, the plausible wrong 
responses had no more effect than 
when Ss were never exposed to them. 
Consequently, from this study it 
appears that the rationale and experi- 
mental evidence on which linear pro- 
gramming is based need not neces- 
sarily exclude frames which utilize 
multiple-choice questions. 

The nature of the terminal behavior 
desired may dictate the type of re- 
sponse mode. In the shaping of par- 
ticular responses, the multiple-choice 
frame may be “out of place.” As 
Schramm (1964, p. 10) has suggested, 
there are probably some learning tasks 
for which one or the other method 
works better, and Williams’ (1963) 
distinction between items requiring 
technical terminology (where con- 
structed response was superior) and 
items requiring general, familiar vo- 
cabulary is a case in point. 

It should be mentioned that this 
multiple-choice test, and the verbal 
materials used by Kaess and Zeaman 
(1960) and Pressey (1950) are not 
programmed materials in that items 
are not arranged in logical sequence 
designed to lead the student in steps 
to the terminal behavior desired. Such 
materials have been termed “adjunct 
programming” (Holland, 1965, p. 67). 
The rationale for including these kinds 
of experiments in the field of pro- 
grammed instruction has been devel- 
oped elsewhere (Schramm, 1964, p. 1), 
but it is an obvious possibility that in 
other types of programming, results 
might differ. 
SCHOLASTIC SUCCESS AND ATTITUDE TOWARD
SCHOOL IN A POPULATION OF SIXTH GRADERS¹


PHILIP W. JACKSON AND HENRIETTE M. LAHADERNE
University of Chicago 


The study examined the accuracy of teachers’ judgments of their stu- 
dents’ satisfaction with school, and the relationship between scholastic 
success and attitude toward school. 292 6th graders responded to 2 
questionnaires that assessed attitudes toward their schools and their 
teachers. IQ, achievement test scores, and course grades were obtained. 
Each teacher estimated each of his students’ overall satisfaction with 
school. The results show that the teachers’ estimations were better than 
chance, but they were related more closely to the students’ academic 
record than to their expressed attitudes. The correlations between stu-
dents’ satisfaction scores and scholastic scores were negligible. There
were no sex differences.


Success and satisfaction are bound 
together by logic, if not by fact. Logi- 
cally at least, successful people ought 
to appear satisfied, and unsuccessful 
people dissatisfied, when queried about 
the conditions surrounding their 
achievements. In educational terms, 
students who are doing well in school 
might be expected to express content- 
ment when asked to describe their 
school experience, and those who are 
doing poorly might be expected to 
express discontentment. Surprisingly, 
however, educational research has not 
yet provided a confirmation of this 
logically compelling expectation. In- 
deed, over the past 25 years an impres- 
sive amount of evidence has accumu- 
lated showing that scholastic success 
and attitudes toward school are typi- 
cally unrelated to each other (Diedrich, 
1966; Jackson & Getzels, 1959; Malpass, 
1953; Tenenbaum, 1944; Tschechtelin, 
Hipskind, & Remmers, 1940).²

1 Expanded version of a paper presented
at the American Educational Research As-
sociation meeting, Chicago, February 1966.

2 One investigator (Brodie, 1964) does
present evidence in support of the predic-
tion, but his results, which seem to hold
only for the girls in his sample, are difficult
to evaluate because of the atypical perform-
ance of one of his experimental groups.

If confirmed by further investiga-
tion, the absence of a strong linkage
between success and satisfaction in 
the classroom should provoke con- 
siderable thought among practitioners 
and researchers alike. Few of the ques- 
tions to which it gives rise are more 
intriguing than those that concern the 
beliefs and behaviors of classroom 
teachers. How sensitive, for example, 
are teachers to differences in their 
students’ views of school? When esti- 
mating the attitudes of their students 
do teachers act in accord with the 
popular expectation linking success 
and satisfaction or are their judgments 
in agreement with the results of empiri- 
cal studies? The study reported here 
was designed to provide partial answers 
to these questions and to explore 
further the general relationship be- 
tween scholastic success and attitudes 
toward school. 


SUBJECTS AND PROCEDURES 

The subjects (Ss) comprised the entire 
sixth grade of the public schools in a pre-
dominantly white, working class suburb
(11 classes located in 6 schools; N = 148 
boys, 144 girls). The pupils’ mean IQ, as 
measured by the Kuhlmann-Anderson 
Intelligence Test, was 103.9, with a standard 
deviation of 19.2. The 11 sixth-grade 
teachers (7 women, 4 men) also participated 
in the study by estimating how satisfied 
each of the students was with his school 
experience. 
Toward the end of the school term, one
of the investigators administered two at- 
titude inventories to Ss during regular class 
periods. To encourage honesty in replying, 
Ss were assured that their responses would 
not be seen by their teachers nor by anyone 
else connected with the schools. The two 
inventories were: 

1. The Student Opinion Poll II. This was 
a revision of a questionnaire developed a 
few years ago (Jackson & Getzels, 1959) 
and designed to elicit responses concerning 
general satisfaction or dissatisfaction with 
four aspects of school life: the teachers, the 
curriculum, the student body, and classroom 
procedures. The version used in this study 
contained 47 multiple-choice items and was 
scored by giving one point each time S chose 
from within a set of multiple-choices the 
response indicating the highest degree of 
satisfaction with that aspect of school life. 
Thus, the possible range of scores was from 
0 to 47. The mean scores for the sample of 
sixth graders were 25.3 for boys and 29.4 for 
girls. The standard deviation for both sexes 
was 8.2. Test reliability for the total sample, 
based on Kuder-Richardson formula 20, 
was .86. The following are sample items. 

6. The things that I am asked to study 
are of: 

a. great interest to me. 

b. average interest to me. 

c. little interest to me. 

d. no interest to me. 

Teachers in this school seem to be: 

a. fair at all times. 

b. generally fair. 

c. occasionally unfair. 

d. often unfair. 

47. In general, my feelings toward school 
are: 

a. very favorable—I like it as it is. 

b. somewhat favorable—I would like 
a few changes. 

c. somewhat unfavorable—I would 
like many changes. 

d. very unfavorable—I frequently feel 
that school is pretty much a waste 
of time. 

2. The Michigan Student Questionnaire 
(abbreviated version). This was a shortened 
form of a questionnaire developed by Flan- 
ders and his associates (Flanders, 1965) to 
assess students’ attitudes toward their 
present teachers and schoolwork. The 
version used in this study contained 37 
descriptive statements, each followed by 
four possible replies: strongly disagree, 
disagree, agree, and strongly agree. A 
student’s response to each item was scored 
4, 3, 2, or 1 depending on the degree to which 


his reply reflected a positive attitude toward 
his school and his teacher. Thus, the possible 
range of scores was from 37 to 148. The mean 
scores for the sample of sixth graders were 
101.5 for boys and 109.3 for girls. Standard 
deviations were 19.2 and 16.9 for boys and 
girls respectively. Test reliability for the 
total sample, based on a variation of the 
Kuder-Richardson formula appropriate for 
weighted scores (Ferguson, 1951) was .94. 
The following are sample items. 
12. What we learn in this class makes me
want to learn new things.
Strongly disagree    Disagree    Agree    Strongly agree

16. This teacher certainly knows how to
teach.
Strongly disagree    Disagree    Agree    Strongly agree

23. I really like this class.
Strongly disagree    Disagree    Agree    Strongly agree

The correlation between responses to the
Student Opinion Poll II and the Michigan
Student Questionnaire was .62 for the total
sample. Thus, although the two question-
naires provide similar information, the
relationship between them is low enough to 
justify the use of both in a study of students’ 
attitudes. 
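
The scoring rules for the two inventories can be sketched as follows, mainly to confirm the stated score ranges (0 to 47 and 37 to 148). The function names are hypothetical, and the sketch assumes each Michigan item has already been keyed so that 4 is the most positive reply; the report does not say which statements were reverse-keyed.

```python
N_POLL_ITEMS = 47       # Student Opinion Poll II
N_MICHIGAN_ITEMS = 37   # Michigan Student Questionnaire (abbreviated)


def score_opinion_poll(chose_most_satisfied):
    """One point for each item on which S chose the most-satisfied option."""
    return sum(1 for chose in chose_most_satisfied if chose)


def score_michigan(item_scores):
    """Sum of item scores, each already keyed from 1 (least) to 4 (most positive)."""
    if any(score not in (1, 2, 3, 4) for score in item_scores):
        raise ValueError("item scores must be 1, 2, 3, or 4")
    return sum(item_scores)


poll_range = (score_opinion_poll([False] * N_POLL_ITEMS),
              score_opinion_poll([True] * N_POLL_ITEMS))
michigan_range = (score_michigan([1] * N_MICHIGAN_ITEMS),
                  score_michigan([4] * N_MICHIGAN_ITEMS))
print(poll_range, michigan_range)
```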

At a special meeting held after school, 
the 11 sixth-grade teachers were shown 
sample items from the Student Opinion Poll 
II and were asked to predict how their 
students might respond to such a question- 
naire. The exact procedure for obtaining the 
teachers’ ratings was as follows. Each 
teacher was presented with an alphabetized 
list of his students. He was asked, first, to 
divide the group into thirds by classifying 
his students into three levels of satisfaction:
“most,” “average,” and “least.” He was
then asked to identify from within the 
groups labeled “most” and “least” a smaller 
number of students (one-fourth of each 
group) who seemed to represent extreme 
positions (“very satisfied” and “very dis-
satisfied”). Thus, each student’s attitudes
were described by his teacher as falling into 
one of five categories. In each classroom the 
approximate fractions of students in the 
five categories were: 1/12, 1/4, 1/3, 1/4, 1/12.
When the ratings were treated quantita- 
tively the values 15, 12, 10, 8, and 5 were 
assigned to the five groupings, the highest 
number being used to represent the at- 
titudes of students whom the teachers had 
described as “very satisfied.”
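
The rating procedure (thirds first, then one-fourth of each extreme third) fixes the share of students in each of the five categories. The sketch below derives those shares and the mean rating they imply when the values 15, 12, 10, 8, and 5 are assigned; it assumes the fractions were met exactly, whereas the report describes them as approximate.

```python
from fractions import Fraction

third = Fraction(1, 3)
extreme = third * Fraction(1, 4)   # one-fourth of the "most" or "least" third

# Shares for: very satisfied, most, average, least, very dissatisfied.
shares = [extreme, third - extreme, third, third - extreme, extreme]
values = [15, 12, 10, 8, 5]

mean_rating = sum(v * s for v, s in zip(values, shares))
print([str(s) for s in shares], mean_rating)
```

Because the values are placed symmetrically about 10, the expected mean rating in every classroom is 10 under these assumptions.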

Two measures of scholastic performance 
were used. The first consisted of four grades,
given by the S’s present teacher, in reading,

TABLE 1


CORRELATIONS BETWEEN TEACHERS’ ESTIMATES AND MEASURES OF STUDENT ATTITUDES, 
ACHIEVEMENT, AND IQ 


Attitude measures 


Achievement Measures 


E co E AE UR 
TOIT | questionaire] Peer | Tamam | Arithmetic 

Boys .289* Prod .49** 51°F .45** .44** 

Girls .27** .25** .96** ;9r** :91** .90** 


Note.—N = 148 for boys; N = 144 for girls. 


**p < .01.


language arts, arithmetic, and science. The 
second consisted of three scores, derived 
from the Stanford Achievement Test, in 
reading, language arts, and arithmetic. 


RESULTS


The findings with respect to the 
relationship between the teachers’ 
estimates of students’ satisfaction and 
scores derived from the students’
responses to the two questionnaires 
are presented in Table 1. That table 
also contains correlations between the 
teachers' estimates and measures of 
the students’ academic performance. 
The findings reveal that students’ 
satisfaction is at least partially visible 
to teachers and can be estimated with 
greater-than-chance accuracy. They 
also reveal, however, that teachers 
tend to expect achievement and satis- 
faction to be more closely related than 
they, in fact, are. Indeed, the correla- 
tions in Table 1 indicate that when 
teachers set out to estimate how a 
student will respond to an attitude 
questionnaire, they come closer to 
describing how well the student 
achieves in school than to how he feels 
about his school experience. 

Table 2 contains correlations be- 
tween attitudes toward school and 
measures of scholastic achievement. 
Four features of that table deserve 
attention. First, and most important, 
all of the correlation coefficients are 
small. Second, they are of the same 


magnitude with teachers’ grades as 
with achievement test scores. Third, 
although the coefficients are uniformly 
higher for the Student Opinion Poll II
than for the Michigan Student Ques- 
tionnaire, the difference between the 
two sets of statistics is trivial. Fourth, 
there are no significant sex differences. 
Thus, these data confirm the results of 
other investigators who have found no 
significant relationship between at- 
titudes toward school and scholastic 
achievement. Furthermore, they ex- 
tend our previous knowledge by 


TABLE 2
CORRELATIONS BETWEEN ATTITUDES AND SCHOLASTIC PERFORMANCE

                       | Student Opinion Poll II | Michigan Student Questionnaire
Scholastic performance | Boys  | Girls           | Boys  | Girls
Grades                 |       |                 |       |
  Reading              | .15   | .16             | .01   | .06
  Language             | .13   | .16             | .01   | .01
  Arithmetic           | .08   | .14             | .00   | .00
  Science              | .15   | .19*            | .06   | .04
Achievement tests      |       |                 |       |
  Reading              | .14   | .08             | .08   | -.07
  Language             | .11   | .14             | .02   | -.06
  Arithmetic           | .13   | .12             | .06   | -.05
IQ                     | .06   | .14             | -.08  | .01

Note.—N = 148 for boys; N = 144 for girls.
*p < .05.
demonstrating the absence of the
success-satisfaction relationship with a 
wider variety of measuring instru- 
ments than has been used in earlier 
studies. 


Discussion 


Several conditions might account for 
the unexpected lack of relationship 
between success and satisfaction. One 
possibility is that the range and in- 
tensity of student attitudes are not as 
great as responses to the questionnaire 
indicate and, thus, are not sufficiently 
powerful to affect behavior. Perhaps 
students typically do not either hate 
school or love it but, instead, feel 
rather neutral about their classroom 
experience. Another possibility is that 
teachers and parents behave in ways 
that effectively weaken whatever nat- 
ural connection might exist between 
attitudes and achievement. In most 
classrooms students are required to 
master the minimal curricular objec- 
tives whether they want to or not. 
Assignments are clearly defined, dead- 
lines are set, and frequent checks are 
made by teachers and parents alike to 
determine whether the work is being 
completed as expected. 

Thus, in several ways teachers, 
parents, and general classroom condi- 
tions may counteract the natural con- 
sequences of differences in students’ 
attitudes. The insignificant correlations 
between responses to questionnaires 


and achievement test scores could be 
interpreted as quantitative evidence of 
the effectiveness of that counteraction. 
EFFECTS OF VERBALIZATION AND PRETRAINING ON
CONCEPT ATTAINMENT BY CHILDREN IN TWO
MEDIATION CATEGORIES


JOSEPH LEE WOLFF²
Institute of Educational Research, Indiana University 


The study was concerned with the effects of 3 variables on concept 
attainment in an Osler-type concept-attainment task: mediational 
ability, verbalization, and pretraining. Both pretraining and verbali- 
zation had a significant positive effect on concept attainment. How- 
ever, mediational ability, which was measured by means of a task 
developed by Kendler, Kendler, and Learnard, was not related to 
performance on the concept-attainment task. Since successful perform- 
ance on this task requires that S use verbal mediators, it was con- 
cluded that the results of the study provide no support for the hypoth- 
esis that mediational ability as defined by the Kendler, Kendler, 
and Learnard task is a function of the presence or absence of verbal
mediating processes.


Several two-stage S-R models have 
been used to predict human behavior 
in the concept-reversal-learning task. 
In the Kendler and Kendler (1962) 
model (hereafter the verbal mediation 
hypothesis) the hypothetical mediat- 
ing response is assumed to be verbal 
or in some way verbally directed. 
House and Zeaman (1962; Zeaman 
& House, 1963), on the other hand, 
have postulated a purely perceptual 
mediator (dimensional orienting re- 
sponse), which serves simply to in- 
crease the probability that relevant 
cues will be observed. 

According to the Kendlers, the two- 
stage model fits only the concept-re- 
versal behavior of older children and 
adults (mediators); the behavior of 
younger children and rats (nonmedi- 
ators), since they do not normally 


1 This paper is based on the author's doc- 
toral dissertation, written under the direc- 
tion of N. A. Fattu. The pilot study for the 
present experiment was undertaken by the 
author in collaboration with John Murphy 
as a project for the Institute of Educational 
Research. Thanks are extended to Mr. 
Murphy for building the apparatus and as- 
sisting in the conduct of the pilot study and 
to Daphne Marlatt, who served as the second 
experimenter.

2 Now at the University of Illinois.

mediate in solving such tasks, may be
best described in accord with a tradi- 
tional single-stage S-R model such as 
Spence’s (1936). Greatly facilitating 
evaluation of the Kendlers’ theory is 
a test technique (hereafter known as 
the mediation test), designed by 
Kendler, Kendler, and  Learnard 
(1962) to identify children as either 
mediators or nonmediators. Thus, 
should children differentially categor- 
ized by the mediation test be given a 
motor discrimination task known a 
priori to be facilitated by verbal 
mediation, it could be predicted from 
the Kendlers’ theory that mediators 
would do better than nonmediators. 
Confirmation of this prediction would 
support the verbal mediation hy- 
pothesis, tend to validate the media- 
tion test and at the same time, inci- 
dentally, pose a problem for the 
dimensional orienting response model, 
which does not postulate a verbal 
mediating process; contrariwise, fail- 
ure to confirm such a prediction would 
cast doubt on the validity of the 
mediation test, disparage somewhat 
the verbal mediation hypothesis and 
tend to support the dimensional ori- 
enting response model. 
To test the above prediction, chil-
dren identified by the mediation test
as mediators and nonmediators were 
given an Osler-type concept-attain- 
ment task (Osler & Powell, 1960). 
Thus, only when the subject (S) at- 
taches a common label to all members 
of the set of positive stimuli and 
motor-responds to the verbal S^D so
produced (i.e., mediates) is he likely 
to solve the task in the allotted num- 
ber of trials. It was expected from the 
verbal mediation hypothesis that me- 
diators would do better than nonme- 
diators on the concept-attainment 
task (Prediction 1). 

In addition to mediation test cate- 
gory, two other variables were em- 
ployed to form a 2 x 2 x 3 factorial 
experiment (4 Ss per cell in the medi- 
ator groups; 6 Ss per cell in the non- 
mediator groups).  Verbalization 
(overt condition versus nonovert con- 
dition) was chosen as one of these 
variables for two reasons: (a) To 
check, under a concept-attainment ar- 
rangement, a previous finding that 
overt verbalization facilitates per-
formance in a discrimination learning
situation (Weir & Stevenson, 1959). 
It was expected that the overt verbali- 
zation condition would be superior 
to the nonovert verbalization condi- 
tion (Prediction 2). (b) To determine 
the differential effect (if any) of overt 
verbalization on the concept-attain- 
ment processes of mediators and non- 
mediators. Kendler (1963) has sug- 
gested that nonmediators do not 
mediate simply because they do not 
verbalize during the discrimination 
task. It was expected, therefore, that 
overt verbalization would facilitate 
the performance of nonmediators more 
than it would that of mediators (Pre-
diction 3).

In introducing the third factor of
the experiment—pretraining—an at- 
tempt was made to assess the transfer 
effect of a preestablished mediated 
connection on concept attainment, as
well as the differential effect of this
connection in mediators and nonmedi-
ators. In Pretraining Condition 1
(PTC 1) the response-produced verbal 
cue, “bird,” was (hypothetically) con- 
ditioned to an overt motor response of 
button pushing by means of a simple 


discrimination problem involving the | 


concept-attainment task apparatus 
(cf. Jeffrey, 1953). Because of this 
connection, it was thought, any stimu- 
lus in the concept-attainment task 
(given immediately after) which elic- 
ited the verbalization “bird” should 
also elicit a button-push response as 
well. Since the concept to be attained 
was “bird,” positive transfer was ex- 
pected from PTC 1. 

Pretraining Condition 2, which 
served as a specific control for PTC 
1, was the same as PTC 1, except that 
“shoe” rather than “bird” was condi- 
tioned to the button-push response. As 
no shoe-pictures appeared as stimuli in 
the concept-attainment task, transfer 
from this connection per se was ex- 
pected to be neither positive nor nega- 
tive. It was predicted, therefore, that 
PTC 1 would be superior to PTC 2 
(Prediction 4). Pretraining Condition 
3 was simply a control for assessing 
all remaining transfer effects involved 
in the pretraining process itself, in- 
cluding warm-up, and consisted of no 
pretraining at all. All other PTC, it 
was expected, would be superior to 
PTC 3 (Prediction 5). 
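
The layout of the 2 x 2 x 3 design described above can be made concrete by enumerating its cells. The sketch below uses the cell sizes stated in the text (4 Ss per mediator cell, 6 per nonmediator cell); the total it reports is derived from those figures rather than quoted from the paper, and the variable names are hypothetical.

```python
from itertools import product

# Ss per cell by mediation-test category, as stated in the text.
ss_per_cell = {"mediator": 4, "nonmediator": 6}
verbalization = ["overt", "nonovert"]
pretraining = ["PTC 1", "PTC 2", "PTC 3"]

# One entry per cell: (category, verbalization, pretraining, number of Ss).
cells = [
    (category, verbal, ptc, n)
    for (category, n), verbal, ptc in product(
        ss_per_cell.items(), verbalization, pretraining
    )
]
total_ss = sum(n for *_, n in cells)

for cell in cells:
    print(cell)
print(len(cells), "cells;", total_ss, "Ss in all")
```

This gives 12 cells, with 24 mediators and 36 nonmediators.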


METHOD 


Apparatus 


Since the Kendler, Kendler, and Learnard 
mediation-test apparatus closely resembles 
that used in the present concept-attainment 
task, their equipment was modified to avoid 
unwanted transfer effects. The present me- 
diation-test apparatus incorporates all essen- 
tial features of the Kendler, Kendler, and 
Learnard equipment but was modeled after 
that used by Kendler and Kendler (1959). 
As the same theoretical principles are alleged 
to hold for both apparatus, this change was 
not thought to seriously alter the character of
the test. Apparatus for the mediation test 
consisted of four opaque tumblers and a 
small gray-enameled turntable. The turn- 
table was made of a 17-inch (diameter) cir- 
cular board across the middle of which was 
placed a 10 X 17-inch plywood screen. Black 
rubber grommets 6 inches apart held the 
marble rewards which were baited under the 
tumblers between trials. Two of the tumblers 
were flat black and two were flat white. One 
of each color was 4.2 inches tall, 1.9 inch wide 
at the top and 2.8 inches wide at the base; the 
other two were 2.1 inches tall, 1.9 inch wide 
at the top and 2.5 inches wide at the base. 

Apparatus for the pretraining and con- 
cept-attainment tasks were the same; only 
the stimuli were different. The gray-enam- 
eled wooden apparatus consisted of a 30 X 
24-inch panel in which were centered two 
windows (2.4 inches in diameter) 7 inches 
apart. Side panels (24 X 12 inches) sup- 
ported the front panel while hiding the 
first experimenter (E-1), who operated the 
apparatus from the rear. All edges and 
windows were bordered in blue. Below each 
window (1.5 inch) was a red response button. 
Behind the panel, a light signaled S’s re- 
sponses and enabled E-1 to administer rein- 
forcement appropriately. Marble rewards 
were delivered through a hole midway be- 
tween the two buttons, 5 inches beneath 
them, and were caught in a plastic cup 1 
inch below. Delivery of the marble, as well 
as presentation of the stimuli, was manually 
controlled. Between trials, E-2 (who alone 
interacted with S in the session) lowered a 
green shade to cover the windows. 

Stimuli for pretraining were a metallic 
blue equilateral triangle (1 inch sides), a 
red square (1.5 inch sides), and a yellow 
circle (1 inch diameter). All figures were
made of construction paper and were glued 
to white backgrounds. 

The concept-attainment task stimuli con- 
sisted of a pack of 20 different cards, on each 
of which were pasted two approximately 
2-inch multicolored pictures (seals) spaced 
to fit the windows of the concept-attain- 
ment task apparatus. Each card presented 
pictures of a bird and either an animal or
flower—the bird seals appearing equally of- 
ten on the right and left sides of the cards 
in a prearranged nonsystematic sequence. 
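
One way to produce a balanced, nonsystematic left-right sequence of the kind described is sketched below. It is only an illustration: the actual sequence used is not reported, and the fixed seed and the limit on runs of the same side (`max_run`) are assumptions introduced here.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical adjacent entries."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def bird_positions(n_cards=20, seed=0, max_run=3):
    """Side of the bird picture on each card, left and right equally often.

    Reshuffles until no side repeats more than `max_run` times in a row
    (a hypothetical constraint, not taken from the paper).
    """
    rng = random.Random(seed)
    sides = ["left"] * (n_cards // 2) + ["right"] * (n_cards // 2)
    while True:
        rng.shuffle(sides)
        if longest_run(sides) <= max_run:
            return list(sides)


sequence = bird_positions()
print(sequence)
```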

Small toys were given as prize incentives 
after the mediation test as well as after the 
pretraining and concept-attainment tasks. 


Overview of Procedure 


Table 1 shows the design of the experi-
ment and order of procedure. A detailed
description of the experimental activities is
presented in the next three sections.


Mediation Test Procedure


All Ss were administered the mediation 
test 2 to 3 days before the concept-attain- 
ment task. Procedure for administration 
conformed strictly to that prescribed by 
Kendler, Kendler, and Learnard with the 
following exceptions: (a) Instructions were 
modified to make them apply better to the 
present apparatus and were slightly short- 
ened. (b) Cut-off for Series 1 was placed at 
100 trials rather than at 150 trials. Two Ss 
were thus eliminated (one trained on size, 
the other on brightness). The full procedure 
is described in Kendler, Kendler, and Learn- 
ard (1962). Below, however, is a brief de- 
scription. 

The test was divided into three parts.
In Series 1 the stimulus pairs (tall-white, 
short-black) and (tall-black, short-white) 
were presented one pair per trial in a prear- 
ranged sequence, and, for any given S, one 
of the four cues (i.e., black, white, tall, or
short) was 100% reinforced by means of 
marbles baited under the positive stimuli. 
[Prior to Series 1, the positive cue for that 
series, as well as the stimulus pair to be 
used in Series 2, was determined randomly 
for each S with the restriction that each cue 
should be positive for an equal number of 
Ss during Series 1 and that the two stimulus 
pairs (i.e., tall-white, short-black, and tall-
black, short-white) should be used with
equal frequency as the Series 2 discrimi-
nanda.] Although the position of the posi-
tive stimulus normally alternated over trials 
from left to right throughout the series ac- 
cording to a prearranged pattern, stimuli re- 
mained constant over trials whenever S 
made an error. All Ss worked to a criterion
of 9 out of 10 consecutive correct responses, 
at which point they were immediately 
shifted to Series 2. 

In Series 2 only one of the stimulus pairs 
used in Series 1 was presented and the previ- 
ously negative stimulus now became posi- 
tive. Following S's achievement of another 
9 out of 10 criterion on this reversed dis- 
crimination, Series 3 commenced. 

In Series 3 the pair of stimuli not used in 
Series 2 was reintroduced and alternated 
with the Series 2 pair in the same sequence 
as was used in Series 1. (Series 3, however, 
had a fixed length of 20 trials, each pair 
appearing 10 times.) Reinforcement of the 
stimuli used in Series 2 continued as in
Series 2; both new stimuli (test pair) were
baited, however, in order to avoid differential
reinforcement. The Series 2 stimuli thus


TABLE 1
EXPERIMENTAL DESIGN AND ORDER OF PROCEDURE

                                                 Session 2
Group             | Session 1      | Pretraining          | Concept-attainment task
Overt-verbalization condition
  PTC 1 (M, N-M)  | Mediation test | Learn S“bird”-R push | Label stimulus before responding
  PTC 2 (M, N-M)  | Mediation test | Learn S“shoe”-R push | Label stimulus before responding
  PTC 3 (M, N-M)  | Mediation test | No pretraining       | Label stimulus before responding
Nonovert-verbalization condition
  PTC 1 (M, N-M)  | Mediation test | Learn S“bird”-R push | Respond without speaking
  PTC 2 (M, N-M)  | Mediation test | Learn S“shoe”-R push | Respond without speaking
  PTC 3 (M, N-M)  | Mediation test | No pretraining       | Respond without speaking

Note. M = mediator; N-M = nonmediator; PTC = pretraining condition.


served to keep S responding to the same cue 
as that to which he responded in Series 2. 
His responses to the test pair disclosed this 
cue (e.g., black). On its basis S was consid-
ered to have made either a reversal shift
(8 of 10 choices of the test stimulus which 
was incorrect in Series 1; such a response 
pattern indicates that S had learned to re- 
spond in Series 2 to the cue which was 
negative in Series 1), a nonreversal shift 
(8 of 10 choices of the test stimulus which 
was correct in Series 1; such a pattern indi- 
cates that S had learned to respond in Series 
2 to a cue within the dimension—e.g., size 
—which was irrelevant in Series 1), or an 
inconsistent shift (less than 8 of 10 choices 
of either member of the test pair). Follow-
ing the procedure of Kendler, Kendler, and
Learnard, Ss making reversal shifts were 
classified as mediators, and the remaining 
Ss were classified as nonmediators. 
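
The Series 3 scoring rule described above can be sketched in a few lines. This is only an illustration: the function and argument names are hypothetical, and the sketch assumes that each of the 10 test-pair trials yields a choice of one or the other member of the test pair.

```python
def classify_shift(reversal_choices, nonreversal_choices, threshold=8):
    """Classify S's responses on the 10 test-pair trials of Series 3.

    reversal_choices: choices of the test stimulus that was incorrect in Series 1
    nonreversal_choices: choices of the test stimulus that was correct in Series 1
    """
    if reversal_choices + nonreversal_choices != 10:
        raise ValueError("the test pair appeared on exactly 10 trials")
    if reversal_choices >= threshold:
        return "reversal"
    if nonreversal_choices >= threshold:
        return "nonreversal"
    return "inconsistent"


def is_mediator(shift):
    """Only Ss making reversal shifts were classified as mediators."""
    return shift == "reversal"


print(classify_shift(9, 1), is_mediator(classify_shift(9, 1)))
print(classify_shift(5, 5), is_mediator(classify_shift(5, 5)))
```

Under this rule both nonreversal and inconsistent Ss fall into the nonmediator category.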

After Series 3, S was shown the stimuli 
of Series 2 and asked “Which one is the
winning cup?” and upon no verbal response,
“What does it look like?” A third query, 
“How do you know?” used by Kendler, 
Kendler, and Learnard was inadvertently 
left out. The child was then given a prize 
and cautioned against speaking of the pro- 
ceedings. 


Pretraining Procedure 


On the second or third day after admin- 
istration of the mediation test, each S who 
received pretraining (i.e., each S in PTCs 1
or 2) was brought to the same room as be-
fore and was immediately seated facing away
from the concept-attainment task apparatus
and taught the “names” of the three pre-
training stimulus forms. The Ss in PTC 1
learned to name the triangle form, “bird,”
and PTC 2 Ss learned to name the same
form, “shoe.” Both groups learned to name
the square form, “house,” and the circle 
form, “cookie.” A typical anticipation 
paired-associates procedure was used to 
teach the names of the forms and the cri-
terion of learning was 15 consecutive correct
trials. 

Following this preliminary portion of the 
pretraining period Ss were given a two- 
choice simultaneous discrimination involv- 
ing the three forms. The discrimination was 
presented by means of the concept-attain- 
ment task apparatus, the forms appearing 
in the two windows of the apparatus and 
Ss responding to the buttons directly under 
the forms. The circle and square forms 
alternated over trials as the negative stimu- 
lus, and the triangle form, which appeared 
in the left and right windows according to 
a Gellerman series, was always the positive 
stimulus. As in the mediation test, however, 
stimuli changed between trials only after a 
correct response. Correct responses were re- 
inforced by means of marble rewards. 

All Ss were required to correctly name 
whichever stimulus they chose before re- 
sponding to it and were prompted when 
unable to do so. In addition Ss were made 
to repeat their choices whenever they re- 
sponded before verbalizing. This additional 
trial was not counted, however. It was ex- 
pected that this procedure would serve to 
condition the overt response of button push- 
ing to the stimulus “bird” in the case of PTC 
1 and “shoe” in the case of PTC 2. 

Following achievement of criterion (9 out 
of 10 consecutive correct responses) Ss in 
PTCs 1 and 2 were given a prize and were 
immediately introduced to the concept-at-
tainment task. Those in PTC 3 (i.e., the no
pretraining condition) began the concept-
attainment task immediately upon entering
the room.


Concept-Attainment Task Procedure 


Procedure for the concept-attainment 
task was similar to that of the pretraining 
discrimination in that stimuli appeared in the 
windows of the apparatus and Ss were re- 
quired to respond to the buttons directly un- 
der the windows. In the concept-attainment 
task, however, a picture of a bird appeared in 
one window on each trial and a picture of 
some living thing other than a bird appeared 
in the other. Moreover, rather than the same 
positive stimulus and one of two negative 
stimuli appearing on each trial, as was the 
case in pretraining, within each block of 20 
trials a different positive stimulus (bird pic- 
ture) and negative stimulus (nonbird pic- 
ture) appeared on each trial (i.e., there were 
pictures of 20 different birds and 20 different 
nonbirds and these pictures were repeated 
every 20 trials). As in pretraining, the posi-
tive stimulus appeared equally often in the
left and right windows of the concept-attain- 
ment task apparatus. 

Prior to the task all Ss were read common 
instructions regarding general task proced- 
ure. In addition, Ss in the overt verbaliza- 
tion condition were instructed to label the 
picture to which they wished to respond be- 
fore responding. (Unlike in pretraining, 
however, Ss were free to apply any label to 
the stimulus that they wished.) The Ss in 
the nonovert verbalization condition were 
instructed to respond without speaking. All 
Ss attempted to attain the concept "bird," 
and attainment of the concept was consid- 
ered to have occurred when S responded to 
a bird picture on 9 of 10 consecutive trials 
(criterion). 

In Trials 1-3, which were not counted, E 
corrected any mistakes in S's procedure. 
Every tenth trial E said, “Never push the 
losing button if you can help it; try to win 
a marble every time.” Any S in the overt 
verbalization condition who failed to name 
a picture before responding to it was forced 
to name the picture and respond to it again 
immediately. This additional trial was not 
counted, however. At criterion or 100 trials 
the problem was stopped; S was then given 
his prize and cautioned not to tell anyone 
how to play the “game.” This caution termi- 
nated the session. 
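
A trials-to-criterion score of the kind used here can be sketched as follows. The scoring details are assumptions rather than the authors' stated procedure: the text does not say whether the criterion run itself is counted, nor what score was given to an S who reached 100 trials without attaining the concept, so the hypothetical function below exposes the first choice as a parameter and simply returns the cap for nonattainers.

```python
def trials_to_criterion(responses, window=10, needed=9, max_trials=100,
                        include_run=True):
    """Return the trial count at which criterion is first met.

    responses: sequence of truthy/falsy values, one per counted trial
               (True = S responded to the bird picture).
    Criterion: at least `needed` correct within `window` consecutive trials.
    Assumption: an S who never meets criterion is scored `max_trials`.
    """
    scored = [bool(r) for r in responses][:max_trials]
    for end in range(window, len(scored) + 1):
        if sum(scored[end - window:end]) >= needed:
            # Either count through the last criterion trial, or count only
            # the trials that preceded the criterion run.
            return end if include_run else end - window
    return max_trials


example = [0, 0, 1, 0] + [1] * 10
print(trials_to_criterion(example))
print(trials_to_criterion(example, include_run=False))
```

The practice trials (Trials 1-3) and the repeated trials forced after a failure to verbalize were not counted, so they would be dropped before the sequence is scored.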

Subjects 

Sixty children (26 boys and 34 girls) 
drawn from three first-grade classes at the 
University Elementary School formed the 
sample. In addition to these children, six 
other Ss participated whose data were not 
reported. Two of these Ss failed the media- 
tion test and were eliminated for this 
reason; in addition four Ss were randomly 
eliminated to obtain proportionality. Of the 
60 children for whom data are reported, 24 
were mediators and 36 were nonmediators. 
Within these mediation categories Ss were 
randomly assigned to either the overt or 
nonovert verbalization conditions and to 
one of three PTCs (four Ss per cell in the 
mediator group and six Ss per cell in the 
nonmediator group). 


RESULTS AND DISCUSSION 
Concept-Attainment Task 


No significant differences between 
mediators and nonmediators nor be- 
tween PTCs 1 and 2 were found in 
the pretraining data. Table 2 presents 


TABLE 2
MEANS AND VARIANCES OF TRIALS TO
CRITERION FOR TWELVE GROUPS ON
THE CONCEPT-ATTAINMENT TASK

                                 |    Mediators    |  Nonmediators
Condition                        | Mean | Variance | Mean | Variance
Overt-verbalization condition
  PTC 1                          |  2.0 |    6     | 29.  | 1100
  PTC 2                          | 22.0 | 1206     | 23.6 |  600
  PTC 3                          | 63.0 |  953     | 56.  |  542
Nonovert-verbalization condition
  PTC 1                          | 63.8 | 1694     | 58.0 | 1539
  PTC 2                          | 65.8 |  403     | 89.1 |  403
  PTC 3                          | 87.5 |  469     | 66.6 |  742
Total                            | 50.7 | 1930     | 53.6 | 1347

Note. N = 4 for the mediators; N = 6 for the nonmediators.


the mean trials to criterion and vari- 
ances of 12 groups on the concept- 
attainment task. A Hartley test in-
dicated extreme nonhomogeneity of
variance (Fmax(12,5) = 2706.5, p < .001).
Since an analysis of variance on these 
data revealed neither a main effect for 
mediation test category (F < 1) nor 
any significant interaction involving 
this variable, rather than risk losing 
a meaningful evaluation of the Ver- 
balization X Pretraining interaction 
by transforming the scores, it was 
elected to collapse over the mediation 
test variable in order to eliminate the 
nonhomogeneity in all subsequently
reported analyses. Table 3 shows the
new means and variances resulting 
from the collapse. 
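
The collapse over the mediation variable amounts to a weighted mean of each pair of Table 2 cells, with weights of 4 (mediators) and 6 (nonmediators). The short check below is an illustrative sketch with hypothetical names; it uses four of the six cells and compares the pooled means with the corresponding Table 3 entries.

```python
N_MEDIATORS, N_NONMEDIATORS = 4, 6

# (mediator mean, nonmediator mean) from Table 2, and the Table 3 mean
cells = {
    ("overt", "PTC 2"): (22.0, 23.6, 23.0),
    ("nonovert", "PTC 1"): (63.8, 58.0, 60.3),
    ("nonovert", "PTC 2"): (65.8, 89.1, 79.8),
    ("nonovert", "PTC 3"): (87.5, 66.6, 75.0),
}


def pooled_mean(mediator_mean, nonmediator_mean):
    """Weighted mean of two cell means, weighted by the cell Ns."""
    total = N_MEDIATORS * mediator_mean + N_NONMEDIATORS * nonmediator_mean
    return total / (N_MEDIATORS + N_NONMEDIATORS)


for key, (m_mean, nm_mean, reported) in cells.items():
    print(key, round(pooled_mean(m_mean, nm_mean), 2), "reported:", reported)
```

The pooled values agree with Table 3 to within rounding of the published cell means.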

An Fmax test on the new variances 
was not significant (Fmax(6,9) = 2.25, 
p > .10). The results of an analysis 
of variance on these data indicated 
that both verbalization (F (1,54) = 
20.3, p < .001) and pretraining (F 
(2,54) = 3.59, p < .05) produced 
significant effects. The interaction was 
not significant (F (2,54) = 2.01, p > 
.10). 
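
Hartley's Fmax statistic is the ratio of the largest to the smallest group variance. As a rough check, and not a reproduction of the authors' computation, the sketch below applies that ratio to the six cell variances of Table 3; the names are hypothetical.

```python
# Cell variances from Table 3 (each cell N = 10)
table3_variances = {
    ("overt", "PTC 1"): 845,
    ("overt", "PTC 2"): 843,
    ("overt", "PTC 3"): 716,
    ("nonovert", "PTC 1"): 1607,
    ("nonovert", "PTC 2"): 1041,
    ("nonovert", "PTC 3"): 737,
}


def hartley_fmax(variances):
    """Ratio of the largest to the smallest variance."""
    values = list(variances)
    return max(values) / min(values)


fmax = hartley_fmax(table3_variances.values())
print(round(fmax, 2))
```

The ratio computed from the rounded variances falls within .01 of the reported value of 2.25.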

Since Predictions 4 and 5 deal spe- 
cifically with the relative ordering of 
the PTC means, individual compari- 
sons among the three PTCs were con- 
ducted. In line with Prediction 5, PTC 
1 differed significantly from PTC 3 
(F (1,54) = 7.36, p < .01); contrary 
to the same prediction, PTC 2 was 
not generally superior to PTC 3 (F 
(1,54) = 2.30, p > .05). 

Contrary to Prediction 4, PTC 1 
was not significantly different from 
PTC 2 (F = 1.33). The failure of this 
prediction to be fulfilled is somewhat 
anomalous in that it was strongly sup- 
ported by a pilot study, particularly 
in the overt-verbalization condition (p <
.001). Because there were age differ- 
ences in the pilot-study and parent- 
study Ss (pilot-study Ss were nursery 
schoolers), as well as changes in the 
concept-attainment task procedures 


TABLE 3
MEANS AND VARIANCES OF TRIALS TO CRITERION FOR SIX GROUPS ON THE
CONCEPT-ATTAINMENT TASK

                                 |      PTC 1      |      PTC 2      |      PTC 3      |      Total
Condition                        | Mean | Variance | Mean | Variance | Mean | Variance | Mean | Variance
Overt-verbalization condition    | 18.6 |  845     | 23.0 |  843     | 59.2 |  716     | 33.6 | 1171
Nonovert-verbalization condition | 60.3 | 1607     | 79.8 | 1041     | 75.0 |  737     | 71.7 | 1239
Total                            | 39.5 | 1748     | 51.4 | 1841     | 67.1 |  830     |

Note. All cells had N = 10.
used, PTCs 1 and 2 (overt verbaliza-
tion condition) were replicated with
38 Ss from the University Nursery 
School (median age = 54 months; 
range 50-61 months) to isolate the 
effect of age. Again no difference was 
found between PTC 1 (X̄ = 64.3) and
PTC 2 (X̄ = 60.0), t < 1. Thus, the
present results provide no support for
the hypothesis that the S“bird”-R push
connection theoretically involved in
concept attainment is in any way 
strengthened through the pretraining 
procedure. Why this hypothesis was 
supported in the pilot study is not 
clear, although the pilot and present 
study differed in a large number of 
small details, any one of which might 
have been responsible for the discrep- 
ant results. 

Though the data clearly show that 
overt verbalization facilitates concept 
attainment (and hence support Pre-
diction 2), why it does is open to
speculation. Weir and Stevenson’s 
(1959) explanation is that verbaliza- 
tion requires S to orient to the rele- 
vant stimulus and thus enhances the 
likelihood of his sampling the correct 
cues, making the correct hypothesis, 
etc. This suggests, however, that overt 
verbalization condition superiority 
would be greatest in PTC 3, since both 
PTCs 1 and 2 also orient S to the ap-
propriate stimuli. But Table 3 shows
that the effect of verbalization is very 
much weaker in PTC 3 than in either 
of the other two conditions. Thus, 
some doubt is cast on this explanation. 
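
The size of the verbalization effect within each pretraining condition can be read from the Table 3 means as the nonovert mean minus the overt mean. The sketch below, with hypothetical names, simply tabulates those differences.

```python
# Mean trials to criterion from Table 3
overt = {"PTC 1": 18.6, "PTC 2": 23.0, "PTC 3": 59.2}
nonovert = {"PTC 1": 60.3, "PTC 2": 79.8, "PTC 3": 75.0}

# Positive values mean that overt verbalization reduced trials to criterion.
effect = {ptc: nonovert[ptc] - overt[ptc] for ptc in overt}

for ptc, saving in effect.items():
    print(ptc, round(saving, 1))
```

The saving in PTC 3 is the smallest of the three, which is the opposite of what the orienting explanation would lead one to expect.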

Another hypothesis, suggested at 
the beginning of this paper, is that in 
the overt-verbalization condition a 
number of Ss are forced to label the 
stimuli who would not have done so 
otherwise. Since labels facilitate per- 
formance, the force of this explanation 
is apparent. On the other hand, it may 
be that overt verbalization simply in- 
creases the salience and discrimina- 
bility of the verbal cue needed for 


TABLE 4
FREQUENCY OF SUBJECTS FALLING IN THREE
CATEGORIES AS A FUNCTION OF THE DI-
MENSION RELEVANT IN SERIES 1 OF
THE MEDIATION TEST

                             |  Dimension relevant
Third series choice category | Size | Brightness
Reversal                     |  19  |     5
Inconsistent                 |   7  |    10
Nonreversal                  |   5  |    14


concept attainment rather than actu- 
ally generates it. Both interpretations 
account for the Weir and Stevenson 
data as well as the present results. 
Further research is needed, it would 
seem, to choose between them. 

Both predictions involving the medi- 
ation test variable, it was previously 
pointed out, were unconfirmed by the 
exploratory 2 X 3 X 2 analysis of 
variance. Thus, even though accepting 
the null hypothesis is a questionable 
tactic, it would appear that the Kend- 
lers’ mediation test is invalid for pre- 
dicting the present concept-attainment 
task. Either of two tentative conclu- 
sions may be drawn from this infer- 
ence: (a) The differential propensity 
to use verbal mediators measured by 
the mediation test is limited to simple 
tasks similar to the test itself. (b) 
Contrary to the verbal mediation hy- 
pothesis (and in line with the di- 
mensional orienting response model) 
shift preference and ease of shift in 
concept-reversal tasks are not always 
functions of the presence or absence
of verbal mediation. The mediation 
test data, as will now be shown, are 
rendered more intelligible by the latter 
alternative. 

According to the verbal mediation
hypothesis, whether an S chooses to 
make a reversal or nonreversal shift 
in the mediation test (i.e., responds as 
a mediator or nonmediator) depends 
on whether or not his behavior during 
the test is guided by implicit verbal
responses. Kendler (1964) has found, 
however, that choice of shift is de- 
termined to some extent at least by 
dimensional salience, and in the pres- 
ent mediation test dimensional sali- 
ence exercised a considerable influence 
on this variable (χ²(2) = 12.89; see
Table 4). This fact would seem to 
render nugatory the contention that 
shift choice (in the present mediation
test at least) is determined by the
presence or absence of verbal medi-
ators. It should be pointed out, how-
ever, that such a fact does not prove 
that verbal processes have little or no 
influence on shift behavior. Indeed, 
Ss may respond primarily to the pre- 
potent dimension (here, size) because 
dimensional orientation is verbally 
controlled (as the verbal mediation 
hypothesis maintains) and size rather 
than brightness labels are dominant 
in the 6-year-old's verbal hierarchy. 
Interestingly enough, however, while 
more Ss responded to size than to 
brightness cues in Series 3 of the medi-
ation test (size = 33, brightness = 10;
p < .001, binomial test), the majority
of Ss who verbalized a unique di- 
mension labeled the stimuli with 
brightness words (brightness = 31, 
size = 12; p < .01, binomial test). 
Thus, motor choice, far from being 
guided by, seems actually to be con- 
trary to the particular psychological 
processes determining verbalization. 
Since 40% of Ss were allegedly “medi- 
ators” this finding is difficult to recon- 
cile with the Kendlers’ theory. In con- 
sideration of this fact and the lack of 
relation between the mediation test 
and concept-attainment task found 
earlier, the present data would appear
to be more in line with the dimensional
orienting response model of Zeaman
and House (1963) than the verbal
mediation hypothesis of Kendler and
Kendler (1962).
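
The chi-square and binomial tests cited in this discussion can be approximated from the reported frequencies. The sketch below is illustrative only and rests on three assumptions: the chi-square is an ordinary Pearson test of independence on the Table 4 frequencies, the Inconsistent/Size cell is 7 (the value that makes the cells sum to the 60 Ss), and the binomial tests are two-sided tests against p = .5. The function names are hypothetical.

```python
from math import comb

# Table 4 frequencies: (size relevant, brightness relevant)
observed = {
    "reversal": (19, 5),
    "inconsistent": (7, 10),  # 7 is inferred from the total of 60 Ss
    "nonreversal": (5, 14),
}


def chi_square(table):
    """Pearson chi-square for a rows x columns frequency table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (obs - expected) ** 2 / expected
    return stat


def binomial_two_sided(k, n):
    """Exact binomial test against p = .5, doubling the larger tail."""
    k = max(k, n - k)
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper)


print(round(chi_square(observed), 2))
print(binomial_two_sided(33, 43) < 0.001)  # motor choice: size vs. brightness
print(binomial_two_sided(31, 43) < 0.01)   # verbal label: brightness vs. size
```

The chi-square obtained this way is within rounding distance of the reported 12.89, and both binomial probabilities fall below the reported levels.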
Scores from 16 tests, 2 for each of 8 abilities (General Reasoning,
Verbal Comprehension, Induction, Deduction, Spatial Scanning, Per- 
ceptual Speed, Rote and Span Memory), and 18 scores from con- 
cept-attainment and information-processing tasks were obtained from 
each of 94 female Ss enrolled in educational psychology at the 
University of Wisconsin. The 34 task and ability variables were inter- 
correlated, then factored using Alpha factor analysis. The 12 Alpha 
factors were rotated to an oblique solution according to the Harris- 
Kaiser criterion. 3 abilities (General Reasoning, Induction, and Verbal 
Comprehension) were found to be related to 3 concept-attainment 
and information-processing factors. The concept-attainment and in- 
formation-processing tasks were seen to be relatively distinct rather
than unitary activities.


In the scientific study of human 
learning, variables have been classi- 
fied as input, intervening, and output. 
Although this classification is appli- 
cable to the study of concept attain- 
ment, a further breakdown of these 
categories is more typical of labora- 
tory experiments in concept attain- 
ment completed thus far: (a) stimu- 
lus, referring to variables associated 
with the material in which the con- 
cepts are embedded; (b) instructions, 
referring to information presented to 
the subjects (Ss) in oral or written 
form concerning the task, procedures, 
and the like; (c) organismic variables
of a cognitive nature, referring to the 
abilities, traits, etc., inferred from the 
test performance of the Ss; (d) organ-
ismic variables of a motivational
nature, dealing with incentives, rein- 
forcements, sets, etc.; and (e) re- 
sponse. While the goal of a compre-
hensive program of research² is to
ascertain functional relationships
among variables in all of the cate-
gories, the present study focuses pri-
marily on organismic variables.

¹ The research reported herein was per-
formed pursuant to a contract with the
United States Office of Education, Depart-
ment of Health, Education, and Welfare.
This article is based on the doctoral dis-
sertation of the senior author.

Such a program of research differs 
from that of Hovland (1952) and 
Hunt (1962), who have treated con- 
cept attainment as information proc- 
essing. Their approach has been to 
treat concept learning and information 
processing as synonymous. Testing
this line of thinking, Tagatz (1963)
found only a small positive correla-
tion, .22, between information proc-
essing and concept attainment. Tagatz
defined information processing as the 
ability of an S to specify the inclusion, 
exclusion, or indeterminance of a card 
to membership in a group of cards 
which are specified as belonging or 
not belonging to a concept. He defined 
concept attainment as the ability to 
utilize information from positive and/
or negative instances in determining a
concept. (See pp. 29-30 for a more de-
tailed description of concept attain-
ment and information processing as
defined in this study.)

² Concept learning and related cognitive
abilities is the main focus of research of the
R & D Center for Learning at the Univer-
sity of Wisconsin.

Despite the substantial amount of 
research on concept attainment, the 
relationship of organismic variables
to concept attainment has been given 
little attention. This investigation is 
specifically concerned with identifying 
those factors or abilities that are 
highly related to concept attainment 
and information processing. After a 
review of the hypothesized factors of 
intellect and tests intended to measure 
them, eight factors were selected for 
study based on their presumed rele- 
vance or lack of relevance to concept- 
attainment and information-process- 
ing tasks. 

A brief description of these eight 
factors, based on French, Ekstrom, 
and Price (1963), and an indication 
of their presumed relationship to con- 
cept attainment will clarify the ra- 
tionale of the present study. 

Rote Memory is defined as the abil- 
ity to retain bits of unrelated mate- 
rial. When an S first encounters a 
large amount of stimulus material of 
the type used in this study, he may 
perceive it as being unrelated (al- 
though it is actually highly related). 
To attain a concept wherein informa- 
tion is tested successively, as in the 
present study, an S must be able to 
retain the information. Some type of 
memory appears to be required for 
efficient concept attainment as defined 
in Task 1 of this study. (Task 1 is 
described more fully in the next sec- 
tion.) 

Span Memory involves the ability 
to recall perfectly for immediate pro- 
duction a series of items after only 
one presentation. Although the mode 
of presentation was simultaneous in 
the concept-attainment tasks, S's iden- 
tification of instances as exemplars or 
nonexemplars was sequential in Task
1. The instances’ sequential identifi-
cation by number was seen as a basis
for including the Span Memory fac-
tor.

Perceptual Speed involves speed in
finding figures, making comparisons,
and carrying out other very simple
tasks involving visual perception. 
Both concept-attainment tasks and 
the information-processing tasks of 
the present study presumably require 
the ability to make comparisons of 
figural material. The S must discrimi- 
nate among visual stimuli and make 
comparisons in order to secure essen- 
tial information. 

General Reasoning is the ability to 
solve a broad range of problems that 
require production of a generally ac- 
cepted correct solution, including 
those of a mathematical nature. Al- 
though the stimulus material in the 
present study is not mathematical, 
attaining a concept presumably re- 
quires the ability to compare informa- 
tion and to arrive at a correct solu- 
tion. 

Deduction involves the ability to
reason from stated premises to their
necessary conclusions. In the present
study, the information-processing
tasks might presumably have required
this ability in that propositions of a
similar type comprised the items of
the information-processing task: If
the focus card is a member of the
group, and the second stimulus card
is also, does the third card definitely
belong to the group, definitely not
belong to the group, or can its mem-
bership not be determined?


Induction probably involves several
abilities associated with the finding of
general concepts that will fit sets of
data, the forming and trying out of
hypotheses. The concept-attainment
tasks presumably involved these abili-
ties directly; to a lesser extent the in-
formation-processing tasks presum-
ably did also.

Spatial Scanning requires the abil- 
ity to explore visually a wide or com- 
plicated spatial field. Finding one’s 
way through a paper maze is a test 
of this ability. A planning ability may 
also be involved. If this ability is 
related to any task in the present 
study, it should presumably be con- 
cept-attainment Task 1, not the other 
performance tasks. 

Verbal Comprehension is the ability 
to understand the English language. 
The importance of the factor in both 
the British factor hierarchy and the 
Thurstone studies suggested its in- 
clusion in the investigation. 

The present study, then, was de- 
signed to clarify relationships among 
cognitive abilities, information proc- 
essing, and concept attainment. A fac- 
tor analysis, to an oblique criterion, 
of the ability (cognitive variables) 
and task (information-processing and 
concept-attainment criterion vari- 
ables) scores provided the model for 
the study. The matrix of intercorrela- 
tions of ability and task factors result- 
ing from the oblique factor rotation 
provided the desired relationships. 


METHOD 


Tasks 


Two concept-attainment tasks and an in- 
formation-processing task were utilized in 
the experiment. The first task, a concept- 
attainment task, is described by Harris 
(1963):

The experimenter (E) chooses the 
concept to be attained. He then points 
out a card on a display of 64 cards that 
belongs to this group; this will be 
called the focus-card or (F). The sub- 
ject (S) is then directed to choose one 
card at a time from the display; for 
each choice E responds “yes” if the 
chosen card belongs to the group, “no” 
if it does not. S is told initially to 
specify the defining characteristics of 
the card group (the concept) when-
ever he believes he has solved the prob-
lem; if his “hypothesis” is correct, the
game terminates; if not, he is told “no”
and asked to continue the card choice.
For such play, S's card choices in the
order in which he makes them, and the
hypotheses he offers in their order and
in relation to his card choices, or solu-
tion, may be recorded.

In the second concept-attainment task S 
was confronted with a small number of 
cards presented simultaneously. Unless re- 
dundancy was introduced, only the mini- 
mum number of cards necessary to attain 
the concept was presented. Thus, S was pre- 
sented a small display and was requested to 
state the concept into which all the cards 
could be classified. The information-process- 
ing task is described by Tagatz (1963): 

That part of the experiment dealing 
with information processing consisted 
of the S responding to 60 items. The 
first 30 were of the type in which one 
card, either an exemplar or a non- 
exemplar, was presented in addition to 
the exemplar focus card. The task was 
to specify the inclusion, exclusion, or 
indeterminateness of a third card to 
membership in a group of cards ex- 
emplifying the concept. The problems 
in which exemplars were presented 
numbered 15 and those presenting non- 
exemplars also 15. Of these 15 exemplar 
items, ten cards whose membership was 
to be determined were definitely ex- 
emplars of the same concepts as the 
focus card. The membership of the re- 
maining 5 test cards could not be de- 
termined. In the 15 problems present- 
ing a focus card and a non-exemplar, 10 
test cards were definitely not exemplars
and 5 were again not determinable. 
Thus these 30 items could be scored on 
the basis of test membership exemplar, 
non-exemplar, and indeterminate. 

The second set of 30 items was con- 
structed, using the same focus card and 
test cards as in the first subtest. The 
information presented in addition to 
the exemplar focus card consisted of two 
cards instead of one as was the case 
with the first subtest. One of the two
cards for each problem was an addi- 
tional exemplar; the other was the same 
in kind as its counterpart in the first 30 
items. The answers to the 30 items of 
the second subtest were identical to 
those in the first subtest. Thus, the two 
subtests were the same except that the 
information presented about the test 


TABLE 1
SUMMARY DESCRIPTION OF EIGHT
INFORMATION-PROCESSING SUBTESTS

Subtest | First instance an exemplar | Second instance an exemplar | Test-card membership | Number of items
1       | yes                        | -                           | Inclusion            | 10
2       | yes                        | -                           | Indefinite           |  5
3       | no                         | -                           | Exclusion            | 10
4       | no                         | -                           | Indefinite           |  5
5       | yes                        | yes                         | Inclusion            | 10
6       | yes                        | yes                         | Indefinite           |  5
7       | yes                        | no                          | Exclusion            | 10
8       | yes                        | no                          | Indefinite           |  5


card in the second subtest included the
additional complexity of one card.

Table 1 gives the design of the arrangement
of exemplar and/or nonexemplar instances 
and number of cards or items per test. As 
shown in Table 1, this information-process- 
ing task resulted in eight subscores of in- 
formation processing based on the type of 
information contained in the stimulus mate- 
rial. 

Task 1, in which S selected instances from 
a total display, had these measures of 
efficiency: time required to attain the con- 
cept, an index of manifested information, 
and total number of cards chosen prior to 
attaining the concept. Task 2, wherein a 
minimally sufficient set of information was 
presented to S, had time to attain the con- 
cept as the index of efficiency. In the in- 
formation-processing task, scores on each 
of eight subtests were performance criteria. 
The index of manifest information in Task
1 was defined as the amount of information
manifested in the first hypothesis, or state-
ment of the concept, relative to that potentially
obtained. Thus, if five bits were potentially
obtained but only three were manifested 
in the hypothesis, the index was .60. The 
Ss were asked for a concept after their sixth 
card choice if one had not been previously
offered. 
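
The index of manifest information is a simple ratio, and the worked example in the text (three bits manifested of five potentially obtained) can be written out as below. The function name and the range checks are illustrative assumptions, not part of the original scoring instructions.

```python
def manifest_information_index(manifested_bits, potential_bits):
    """Bits manifested in the first hypothesis divided by bits potentially obtained."""
    if potential_bits <= 0:
        raise ValueError("potential information must be positive")
    if not 0 <= manifested_bits <= potential_bits:
        raise ValueError("manifested information cannot exceed potential information")
    return manifested_bits / potential_bits


print(manifest_information_index(3, 5))
```

An index of 1.0 would mean that the first hypothesis used all of the information available from the cards chosen up to that point.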

Stimulus materials used in this investi- 
gation were patterned by Byers (1961) after 
the Wisconsin Card Sorting Task. The con- 
cept-attainment display consisted of an 
ordered arrangement of attributes, by rows 
and columns, which formed an 8 X 8 array
of 64 cards. On every card each of six attributes
was represented by one of two defining char-
acteristics. These six attributes and their
dichotomous defining characteristics were: 

1. Border number: one—two
2. Border continuity: solid—broken
3. Figure number: one—two
4. Figure size: large—small
5. Figure color: red—green
6. Figure shape: circle—ellipse

Each card was different from all other
cards on the display in at least one of the
dichotomous defining characteristics.
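
Six dichotomous attributes yield 2^6 = 64 distinct cards, which is the size of the display. The sketch below enumerates such a set and counts the exemplars of the two sample concepts mentioned in the Procedure section (small, red figures; two borders, small, red figures). The identifiers are hypothetical, a conjunctive reading of the concepts is assumed, and the ordering of cards produced here is not meant to reproduce the row and column arrangement of the actual display.

```python
from itertools import product

ATTRIBUTES = {
    "border_number": ("one", "two"),
    "border_continuity": ("solid", "broken"),
    "figure_number": ("one", "two"),
    "figure_size": ("large", "small"),
    "figure_color": ("red", "green"),
    "figure_shape": ("circle", "ellipse"),
}


def build_display():
    """Enumerate every combination of the six two-valued attributes."""
    names = list(ATTRIBUTES)
    return [dict(zip(names, values)) for values in product(*ATTRIBUTES.values())]


def is_exemplar(card, concept):
    """A card belongs to a conjunctive concept if it matches every relevant value."""
    return all(card[attr] == value for attr, value in concept.items())


cards = build_display()
two_attribute = {"figure_size": "small", "figure_color": "red"}
three_attribute = {"border_number": "two", "figure_size": "small",
                   "figure_color": "red"}

print(len(cards))
print(sum(is_exemplar(c, two_attribute) for c in cards))
print(sum(is_exemplar(c, three_attribute) for c in cards))
```

Each additional relevant attribute halves the number of exemplars on the display, which is one way of seeing why the three-attribute concept is the more complex of the two.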


Tests 


A battery of 16 ability tests, two for each 
of eight factors previously mentioned, was 
administered to all Ss. The tests were from 
the Reference Kit for Cognitive Factors 
(French et al., 1963). Table 2 presents the 
factor, number of items, and reliability 
coefficients for each test. Reliability coeffi-
cients for variables measuring Perceptual
Speed and Spatial Scanning factors are not 
reported because of the spuriously high co- 
efficients resulting from the obvious speeding 
of these tests. 


Subjects 


The Ss for the study were 94 graduate 
and undergraduate females from two educa- 
tional psychology classes. All Ss participated 
in four sessions of group testing and a fifth 
individualized concept-attainment session. 
All Ss were in the age range of 20-35 and 
had a median age of 21 years. 


Procedure 


Each S attained a set of two selection 
concepts and a set of four presented con- 
cepts. For the first two concepts, S selected 
instances from a display in which all the 
stimulus material was presented simulta- 
neously. The concepts to be attained were 
of two and three relevant attributes, for 
example, small, red figures; two borders, 
small, red figures. Ss were randomly assigned 
to one of two sequences in order to control 
for the complexity effect of the concepts. 
The sequences were: two-attribute concept, 
three-attribute concept; or three-attribute 
concept, two-attribute concept. 

After attaining the first two selection con- 
cepts, each S attained four concepts from 
a minimally sufficient set of simultaneously 
presented information. These four concepts 
were solved in a sequence with positive and 
negative instances varying one attribute 
from the focus card in the first two concepts. 
In the last two concepts to be attained, 
negative instances were varied one attri- 
bute, but only one positive instance was 
used to attain the minimally sufficient set 
of information. The sequence was comprised 
of concepts of two, three, two, and three 
relevant attributes. 


TABLE 2 

RELIABILITY ESTIMATES OF IDENTIFICATION VARIABLES FOR EIGHT HYPOTHESIZED FACTORS 

Factor             Test                             No. of items  Procedure   r_tt 
Rote Memory        Object-Number                    15            Split-half  .68 
                   First & Last Names               30            Split-half  .77 
Verbal             Vocabulary                       36            Split-half  .86 
                   Advanced Vocabulary              36            Split-half  .81 
Deduction          Logical Reasoning                40            Split-half  .72 
                   Nonsense Syllogisms              30            K. R. 20    .88 
Span Memory        Auditory Number Span             24            K. R. 20    .62 
                   Auditory Letter Span             24            K. R. 20    .76 
General Reasoning  Ship Destination                 57            Split-half  .93 
                   Necessary Arithmetic Operations  15            Split-half  .74 
Perceptual Speed   Finding A’s                      25                        * 
                   Number Comparison                24                        * 
Induction          Locations                        28            Split-half 
                   Letter Sets                      30            Split-half 
Spatial            Map Planning                     20                        * 
                   Maze Tracing Speed Test          24                        * 

* Traditional computational techniques not appropriate. 

The information-processing task was ad- 
ministered in a large group-testing session 
after the individually administered con- 
cept-attainment session. Stimulus materials, 
or instances of concepts, were presented on 
3 X 3 slides. Each slide thus comprised an 
item and was subsequently scored as correct 
or incorrect. 


RESULTS 

Two scale-free models, Incomplete 
Image analysis (Harris, 1962), and 
Alpha Factor analysis (Kaiser & Caf- 
frey, 1963), which rescale the reduced 
correlation matrix in the metric of the 
unique and common parts respec- 
tively, were employed as data reduc- 
tion models. Only the Alpha model is 
reported in this discussion. 

A Normal Varimax rotation (Kai- 
ser, 1958) of the Incomplete Image 
factors yielded 23 factors, eight of 
which were uninterpretable. The same 


orthogonal criterion applied to the 
Alpha factors yielded a derived matrix 
of 12 factors, all of which were inter- 
pretable. The 12 Alpha factors, when 
rotated to the Normal Varimax cri- 
terion, were found to include the fol- 
lowing factors (Roman numerals in 
Table 3) identified by the experimen- 
tal tests of cognitive abilities: (I) 
Verbal Comprehension, (IV) Rote 
Memory, (VI) Span Memory, (VII) 
Spatial Scanning, (X) Deduction, 
(XI) General Reasoning, and (XII) 
Induction. The Perceptual Speed fac- 
tor was not identified. Thus seven of 
the eight ability factors which the 16 
experimental tests were supposed to 
measure were in fact identified. 

Five factors identified by the load- 
ings of the task variables—three con- 
cept-attainment and two information- 
processing factors—were called (Table 
3): (II) Information Processing (In- 
clusion-Exclusion), (III) Selection 
(Concept 2), (V) Presented Concept, 
(VIII) Selection (Concept 1), (IX) 
Information Processing (Indetermi- 
nate). 

To observe ability-task relation- 
ships, the Alpha factor matrix was 
rotated to an oblique solution using 
the Harris-Kaiser (1964) criterion. 
Since some variables were of com- 
plexity greater than one, the A’A pro- 
portional to L case was used. Table 
3 represents the 34 X 12 oblique factor 
matrix. 

Italic loadings of Table 3 provided 
the rationale for oblique factor de- 
scriptions. Only the Induction factor, 
educed from the Letter Sets and Loca- 
tions tests, presented some difficulty 
in identification. Thurstone’s (1940) 
isolation of this factor from the Letter 
Grouping test facilitated the factor 
identification. (See also Goodman, 
1943; Kettner, Guilford, & Christen- 
sen, 1959; Thurstone, 1941). The Spa- 
tial Scanning factor, previously iso- 
lated but unidentified by Thurstone 
and Thurstone (1941) and suggested 
by French et al. (1963) as represent- 
ing a “planning” function, is also of 
interest. The strong involvement of 
this factor with tests from the reason- 
ing domain suggests that there is, in 
fact, a strong convergent involvement 
in this type of activity. 

The L matrix, the matrix of factor 
intercorrelations for the 12 task and 
ability factors, is presented in Table 4. 
Of 21 correlations among seven ability 
factors (Table 4), seven were positive 
and ranged between .238 and .579; 
Verbal Comprehension, General Rea- 
soning, and Induction are involved in 
these seven correlations. Of 10 cor- 
relations among three concept-attain- 
ment and two information-processing 
tasks, five were positive and ranged 
between .213 and .431; the factor In- 
formation Processing (Indeterminate 
Inclusion) was involved in three of five 
positive correlations. Thus correla- 
tions among variables within each of 
two sets of variables were of about the 
same magnitude and the proportion of 
the total was about the same. 


TABLE 4 
MATRIX OF INTERCORRELATIONS OF 12 ALPHA FACTORS 

Of 35 correlations between the set 
of seven ability variables and the set 
of five task variables, 12 were posi- 
tive and ranged from .220 to .461. Of 
these 12 correlations, 10 involved 
General Reasoning and Induction; the 
other two, Verbal Comprehension. 
General Reasoning correlated posi- 
tively with all five of the task fac- 
tors, the range being from .363 to .461. 
Induction also correlated with all five 
task factors, the range of r's being 
.229 to .353. Thus General Reasoning, 
Induction, and, to a lesser extent, 
Verbal Comprehension correlated sub- 
stantially and consistently with con- 
cept attainment and information proc- 
essing; the other four abilities (Rote 
Memory, Span Memory, Spatial 
Scanning, and Deduction) did not. 
Further, Information Processing (In- 
determinate Inclusion) was the information- 
processing factor that correlated most 
consistently with other task factors 
and the cognitive factors, six of 11 r’s 
ranging between .213 and .431. Pre- 
sented Concept was the concept-at- 
tainment factor to correlate most con- 
sistently with task and cognitive 
ability factors, five of 11 r’s ranging 
between .213 and .461. 
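
The counts of correlations quoted in this section (21, 10, 35, and 11 per factor) follow directly from the number of factors in each set. The sketch below recomputes them, along with the share of each block that the text reports as positive; variable names are illustrative.

```python
from math import comb

n_ability = 7  # ability factors identified
n_task = 5     # three concept-attainment plus two information-processing factors

within_ability = comb(n_ability, 2)      # pairs among ability factors
within_task = comb(n_task, 2)            # pairs among task factors
between_sets = n_ability * n_task        # ability-by-task pairs
others_per_factor = n_ability + n_task - 1  # correlations involving any one factor

# Number of correlations the text reports as positive in each block.
reported_positive = {"ability": 7, "task": 5, "between": 12}
shares = {
    "ability": reported_positive["ability"] / within_ability,
    "task": reported_positive["task"] / within_task,
    "between": reported_positive["between"] / between_sets,
}

print(within_ability, within_task, between_sets, others_per_factor)
print({name: round(value, 2) for name, value in shares.items()})
```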


Discussion 


Low correlations were found be- 
tween a set of cognitive abilities, a set 
of concept-attainment factors, and a 
set of information-processing factors. 
However, several abilities—Verbal 
Comprehension, General Reasoning, 
and Induction—correlated consist- 
ently with the concept-attainment 
and information-processing factors. 

The identification of three rela- 
tively distinct concept-attainment fac- 
tors and two information-processing 
factors with only a few correlations of 
modest size among these factors was 
not anticipated. Consider the three 
concept-attainment factors. Two of 
these, Selection Concept I and Selec- 
tion Concept II, resulted from meas- 
ures of Ss’ attaining two concepts in 
sequence, under identical experimental 
conditions; that is, in one sitting with 
identical instructions, materials, etc. 
Why did two separate factors result? 
Ss first attained a two-attribute and 
then a three-attribute concept or first 
a three-attribute and then a two-attri- 
bute concept. They were not informed 
of this attribute change, which affected 
the difficulty level of the concept. Ap- 
parently, the change in the number of 
attributes and the ordinal position af- 
fected performance in such a manner 
as to result in separate factors. 

The third factor, Presented Concept, 
resulted from a very different experi- 
mental situation. Here the task was 
for Ss to attain four concepts of two 
and three attributes in sequence; how- 
ever, only the minimum amount of 
information necessary to attain the 
concept was presented. Thus, attain- 
ing the concept under these conditions 
was distinctly different from that in 
which a large array was presented 
simultaneously and Ss selected cards 
successively. 

Information processing as defined 
by eight subtests also yielded two fac- 
tors, one being based on items of the 
type where the card definitely was or 
was not a member of the concept, the 
other being based on items where in- 
sufficient information was presented 
to determine inclusion in the concept. 
Identification of the two factors sug- 
gests that information processing is 
not a unitary ability, much the same 
as concept attainment is not. Further, 
information processing as defined in 
this study is not closely related to 
concept attainment. 

The above considerations lead to a 
further question concerning the na- 
ture of concept attainment and in- 
formation processing, using stimulus 
material of the type included in the 
present study. How unique are tasks 
that result, for example, in three con- 
cept-attainment and two informa- 
tion-processing factors? How many 
factors would be found if various 
sequences of one-, two-, three-, four-, 
five-, and six-attribute concepts were 
used? These questions remain unan- 
swered; however, the identification of 
the task factors suggests clearly that 
concept attainment and information 
processing are quite different, and that 
these tasks appear to be quite different 
from the cognitive factors identified in 
this study. 


EFFECTS OF SOCIAL CUES AND TASK COMPLEXITY 
IN CONCEPT IDENTIFICATION¹ 


AARON WOLFGANG² 
Behavioral Science Laboratories, Veterans Administration Hospital, Oklahoma City 


The 120 Ss participated in a factorial design which included 3 levels of 
task complexity and 3 conditions where Ss served individually or with 
a partner. In the free interaction (FI) condition 2 Ss were free to 
communicate and in the restricted interaction (RI) condition com- 
munication between Ss was limited. Ss in the FI condition outper- 
formed individuals (only on the most complex concepts) and Ss in 
the RI condition across all levels of complexity. The assumption of 
the mathematical theory of CI that learning rate should decrease 
with increases in irrelevant information was confirmed in the in- 
dividual and RI conditions but not in the FI condition. 


Learning frequently occurs in 
groups where individuals exchange in- 
formation, yet the most influential 
learning theories, including the more 
recent mathematical models (e.g., 
Estes, 1963; Restle, 1955), have pos- 
tulates based solely on the behavior 
of the individual performing alone. 
Bourne and Restle (1959), in attempt- 
ing to develop a comprehensive mathe- 
matical theory of concept identifica- 
tion (CI), have not as yet considered 
social cues and their influence on 
learning. In dealing with the effects of 
a wide variety of variables on CI, in- 
vestigators have focused on the per- 
formance of individual subjects (Ss). 
Recently, Pishkin and Blanchard 
(1963), in extending the mathematical 
theory of CI to include social parame- 
ters, have established the value of so- 
cial cues in a situation where a pro- 
grammed stooge provided social cues 
but where communication between Ss 
was prohibited. Research dealing with 
the social aspects of CI where Ss are 
free to discuss their hypotheses re- 
garding the solution of problems vary- 
ing in amount of irrelevant informa- 
tion or complexity has received little 
or no attention. The purpose of the 
present study was to compare learning 
rates of two-man groups free to com- 
municate with those of single Ss and 
two-man groups where communica- 
tion was restricted in solving CI prob- 
lems of different complexities. 


1 This article is based on a doctoral dis- 
sertation submitted to the Graduate School 
of the University of Oklahoma. Special 
thanks to Vladimir Pishkin for valuable sug- 
gestions. This project was supported in 
part by the Veterans Administration Medi- 
cal Research Program (8200 Funds). 

2 Also at the University of Oklahoma 
Medical Center. 


METHOD 


Subjects 


The Ss were 120 male volunteer students 
in elementary psychology courses, who were 
randomly divided into nine treatment con- 
ditions. The Ss in the two-man groups who 
acknowledged that they were friends were 
excluded from participating with each other. 


Design 


A 3 X 3 factorial design was used, which 
included three levels of complexity (1, 3, or 
5 irrelevant bits of information) and three 
levels of interaction. The basic unit of meas- 
urable relevant and irrelevant information 
in information theory is a bit, which is de- 
fined as log₂ x, where x is the number of 
equally probable stimulus events. The gen- 
eral rule is that every time the number of 
alternatives from which to make a decision 
is increased by a factor of 2, 1 bit of infor- 
mation is added. In the present experiment 
the stimuli to be classified were specified 
along different binary stimulus dimensions 
(e.g., size: large or small; position: top or 
bottom). There was always 1 bit of rele- 
vant information accompanied by 1, 3, or 
5 irrelevant bits. Thus for problems with 1 
irrelevant bit there were 4 stimuli to cate- 
gorize; with 3 irrelevant bits, 16 possible 
stimuli; and with 5 irrelevant bits there 
were 64 stimuli from which to make a 
choice. 
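
The stimulus counts above follow from the definition of a bit: with one relevant binary dimension and k irrelevant ones, there are 2^(1 + k) equally probable stimuli. A minimal sketch, with illustrative function names, reproduces the three counts.

```python
from math import log2

RELEVANT_BITS = 1  # every problem had exactly one relevant binary dimension


def stimulus_count(irrelevant_bits, relevant_bits=RELEVANT_BITS):
    """Number of distinct stimuli when every dimension is binary."""
    return 2 ** (relevant_bits + irrelevant_bits)


counts = {k: stimulus_count(k) for k in (1, 3, 5)}
print(counts)

# Going the other way: the information in n equally probable stimuli.
total_bits = {k: log2(n) for k, n in counts.items()}
print(total_bits)
```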

In each of 24 two-person groups in the 
free interaction condition (FI), Ss were free 
to discuss their hypotheses or ideas with 
each other but had to reach a mutually 
agreeable decision on each trial before pro- 
ceeding to the next trial; in the restricted 
interaction condition (RI) no discussion was 
permitted between Ss in each of the 24 two- 
person groups, and Ss simply stated and reg- 
istered their individual decisions on one of 
the two response keys on each trial; and 
lastly, 24 Ss performed in the individual 
condition (I). 
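
The group sizes given here account for all 120 Ss. The sketch below also infers the number of analysis units per cell; treating each pair or individual as one unit is an assumption, though it agrees with the N of 8 per point and the 63 error degrees of freedom reported in the Results.

```python
# Units per interaction condition, as stated in the text.
fi_pairs = 24
ri_pairs = 24
individuals = 24

total_subjects = 2 * fi_pairs + 2 * ri_pairs + individuals

# Assumption: each pair or individual is one unit of analysis.
units = fi_pairs + ri_pairs + individuals
cells = 3 * 3                    # interaction levels by complexity levels
units_per_cell = units // cells
error_df = units - cells         # within-cell degrees of freedom

print(total_subjects, units, units_per_cell, error_df)
```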


Task and Apparatus 


In solving a two-choice CI problem, S's 
task was to categorize a series of geometric 
patterns flashed on the screen in accordance 
with a relevant binary stimulus dimension. 
For example, where the relevant dimension 
was form, S would press key A in response to 
a square, and key B in response to a tri- 
angle and be correct. If, however, S re- 
sponded to the irrelevant dimensions such 
as color (red or green) which has a zero 
correlation with the correct response, then 
his responses would be correct only at the 
chance level. The relevant dimensions of the 
two problem types were form and number. 
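
The claim that responding to an irrelevant dimension is correct only at chance can be checked by enumerating the stimuli. This is a sketch under the assumption that every combination of binary values occurs equally often; the dimension indices are arbitrary.

```python
from itertools import product


def proportion_correct(response_dim, relevant_dim, n_dims):
    """Share of stimuli on which keying off `response_dim` gives the correct key.

    Each stimulus is a tuple of binary values, one per dimension, and the
    correct key is the value on `relevant_dim`. All 2**n_dims stimuli are
    assumed to occur equally often.
    """
    stimuli = list(product((0, 1), repeat=n_dims))
    hits = sum(s[response_dim] == s[relevant_dim] for s in stimuli)
    return hits / len(stimuli)


# One relevant dimension (index 0) plus five irrelevant ones.
on_relevant = proportion_correct(0, 0, n_dims=6)
on_irrelevant = proportion_correct(3, 0, n_dims=6)
print(on_relevant, on_irrelevant)
```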

The standard CI apparatus was used 
(Pishkin, 1960) with several modifications 
—experimenter’s (E's) panel board and a 
semi-soundproof cubicle—described below. 

The E's panel board was electronically 
connected to S's panel and contained, in an 
identical manner, two response keys identi- 
fied by the letters A and B and two feed- 
back lights; but the panel boards differed in 
function. When S pressed a key E's panel 
light indicated S's choice of response; then 
E, using a planned program of feedback co- 
ordinated with the filmstrip programming, 
depressed a key which lit up one of the feed- 
back lights on S's panel for approximately 
1 second, indicating to S the correctness of 
his response. The Esterline Angus 20-pen 
operations recorder was electronically con- 
nected to both E's and S's panel boards to 
record S's response and E's feedback. 

To reduce background noise from the ap- 
paratus and surroundings, a wooden cubicle 
lined with soundproof tiles was constructed. 
The three-paneled cubicle with a top and 
two sides was 63 inches high, 36 inches from 
front to back, and 48 inches in width. It was 
roomy enough so that one or two Ss could 
be comfortably seated. The cubicle was ar- 
ranged so that Ss could clearly view only the 
screen and panel board directly in front of 
them. 


Procedure 


At the start of each session, Ss in the 
social conditions were seated inside the 
soundproof cubicle in front of the screen, 
each having access to the response keys. The 
Ss were told about the nature of the task 
and the significance of the response keys and 
feedback lights. The instructions for indi- 
viduals and Ss in two-person groups were 
essentially the same except that Ss in the 
FI group were told that they were to ar- 
rive at a single decision and only one de- 
cision could be registered per trial by either 
S; whereas Ss in the noninteracting control 
group were simply told to state their in- 
dividual decisions verbally and then register 
them by depressing one of the two response 
keys on each trial. 

The criterion to solution for Ss in all con- 
ditions was 16 consecutive correct responses. 
In the RI condition, criterion to solution oc- 
curred when each S independently made 16 
consecutive correct responses. Ss who did 
not reach criterion were stopped after 
a maximum of 192 trials. 
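
The stopping rule can be sketched as a scan for the first run of 16 correct responses within at most 192 trials. The function name and the convention of counting every incorrect response up to the stopping point as an error are assumptions; the text does not spell out the scoring.

```python
def trials_to_criterion(responses, run_length=16, max_trials=192):
    """Return (solved, trials_used, errors) for a sequence of True/False responses.

    A problem counts as solved on the trial that completes `run_length`
    consecutive correct responses; otherwise the session stops at `max_trials`.
    """
    streak = 0
    errors = 0
    considered = responses[:max_trials]
    for trial, correct in enumerate(considered, start=1):
        if correct:
            streak += 1
            if streak == run_length:
                return True, trial, errors
        else:
            streak = 0
            errors += 1
    return False, len(considered), errors


# Two early errors, then an unbroken run of 16 correct responses.
example = [False, True, False] + [True] * 16
print(trials_to_criterion(example))
```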


RESULTS 


Analysis of Error and Time Scores 


Due to the marked heterogeneity of 
variance (the ratio between the largest and 
smallest variance was over 100), a log 
transformation upon error and time 
scores was performed, resulting in 
homogeneity of variance. An analysis 
of variance on log error scores dis- 
closed significant main effects for 
levels of interaction (F = 27.12, df 
= 2/63, p < .001) and complexity (F 
= 10.40, df = 2/63, p < .001). Mean 
numbers of errors were 29.75, 13.5, and 
2.29 in the RI, I, and FI conditions, 
respectively. Duncan’s multiple-range 
test revealed that Ss in the FI condi- 
tion were superior to those in the I (df 
= 63, p < .005) and RI conditions (df 
= 63, p < .001). The Ss in the I con- 
dition made significantly fewer errors 
than those in the RI condition (df = 
63, p < .001). 

Figure 1 presents mean errors be- 
tween interaction conditions at each 
level of complexity. To compare per- 
formance differences between inter- 
action conditions at each level of 
complexity, t tests were computed. Al- 
though Ss in the FI condition made 
fewer errors than those in the I condi- 
tion along all levels of complexity, 
significance was reached only at the 
highest level of complexity (t = 3.70, 
df = 63, p < .001). In contrast, Ss in 
the FI condition significantly outper- 
formed those in the RI condition on 
problems with 1 (t = 3.33, df = 63), 3 
(t = 4.90, df = 63), and 5 irrelevant 
bits (t = 4.62, df = 63) with p < .01. 
Individuals made significantly fewer 
errors than Ss in the RI condition on 
problems with 1 (t = 2.85, df = 63, p 
< .01) and 3 irrelevant bits (t = 3.44, 
df = 63, p < .01) but on problems with 
5 irrelevant bits no significant differ- 
ences were found (t = .92, df = 63, p 
> .05). 

The significance of the complexity 
main effect showed that as the amount 
of irrelevant information increased, 
mean errors progressively increased. 
The mean numbers of errors for prob- 
lems containing 1, 3, or 5 irrelevant 
bits of information were 4.75, 12.12, 
and 28.66, respectively. An orthogonal 
polynomial analysis was performed 
for complexity and only the linear 
component reached significance (F_lin 
= 19.99, df = 1/63, p < .001). 

FIG. 1. Mean errors as a function of ir- 
relevant bits of information. (The parameter 
is levels of interaction. Each point repre- 
sents a mean based on an N of 8.) 

An analysis of variance of mean log 
time to solution in minutes revealed 
significant main effects for levels of 
interaction (F = 18.54, df = 2/63, p 
< .001) and task complexity (F = 
15.12, df = 2/63, p < .001). Subse- 
quent analysis of levels of interaction 
with Duncan’s multiple-range test in- 
dicated that there were insignificant 
differences in time to solution between 
Ss in the FI and I conditions (df = 63, 
p > .05). However, there were signifi- 
cant differences in performances be- 
tween Ss in the FI and RI condition 
(df = 63, p < .001), with the Ss in the 
RI condition taking longer to reach 
solution. Mean times to solution in 
minutes for Ss were: FI, 6.42; RI, 
17.50; and I, 8.16. The significant main 
effect of complexity indicated that dif- 
ferences among mean times to solu- 
tion were a function of the amount of 
irrelevant information. Mean times to 
solution in minutes for problems with 
1, 3, and 5 bits of irrelevant informa- 
tion were 5.58, 9.75, and 16.79, respec- 
tively. 
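
With equal cell sizes, the condition means for interaction and for complexity should average to the same grand mean. The sketch below applies that consistency check to the means reported in this section; the tolerance of .05 is an arbitrary allowance for rounding.

```python
# Marginal means as reported in the text.
errors_by_interaction = {"RI": 29.75, "I": 13.5, "FI": 2.29}
errors_by_complexity = {1: 4.75, 3: 12.12, 5: 28.66}
minutes_by_interaction = {"FI": 6.42, "RI": 17.50, "I": 8.16}
minutes_by_complexity = {1: 5.58, 3: 9.75, 5: 16.79}


def grand_mean(marginals):
    """Unweighted mean of the marginal means (valid for equal cell sizes)."""
    return sum(marginals.values()) / len(marginals)


error_gap = abs(grand_mean(errors_by_interaction) - grand_mean(errors_by_complexity))
minutes_gap = abs(grand_mean(minutes_by_interaction) - grand_mean(minutes_by_complexity))

print(round(grand_mean(errors_by_interaction), 2), round(grand_mean(errors_by_complexity), 2))
print(round(grand_mean(minutes_by_interaction), 2), round(grand_mean(minutes_by_complexity), 2))
```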


Discussion 


In the present study task complex- 
ity proved to be important in clarify- 
ing the generalization that cooperative 
groups are superior to individuals in 
learning. That FI groups outperformed 
individuals only on the most complex 
problems where information load, 
memory requirements, and number of 
alternative hypotheses were greatest 
could be attributed to several factors. 
First, opportunity for perseverating on 
an incorrect hypothesis is reduced in 
the FI group, since an S would have to 
offer his partner some justification for 
his persistently incorrect responses 
whereas in the individual condition no 
such justification was necessary. Sec- 
ond, and probably more important, 
FI groups have more resources avail- 
able and can draw on two memories 
for retaining and evaluating past in- 
formation. The importance of memory 
in concept learning has been demon- 
strated by Cahill and Hovland (1960), 
Hunt (1961), Bourne, Goldstein, and 
Link (1964), and Pishkin and Wolf- 
gang (1965). Recently, Laughlin 
(1965) presented evidence indicating 
that two-person cooperative groups 
showed better memory in concept at- 
tainment than individuals and made 
more efficient use of memory than in- 
dividuals. 

Judging from the relatively poor 
performance of the RI group, it is 
plausible that more irrelevant dis- 
tracting cues were involved. The CI 
task can be described as one that re- 
quires sustained attention, concentra- 
tion, reliance on past information feed- 
back, and the ability to identify and 
classify relevant cues. The Ss in the 
RI condition might have had difficulty 
in concentrating on their own hypoth- 
eses and feedback when at the same 
time additional information was 
offered by their partners who were 
voicing and registering similar and 
contrasting responses. That learning 
was more retarded in the RI condition 
than in the individual condition was 
consistent with the results of a num- 
ber of animal and human studies re- 
ported by Zajonc (1965). 

In view of the slight deterioration 
in performance on problems of in- 
creasing complexity by Ss in the FI 
condition, it seems reasonable to ex- 
pect that they would be able to solve 
problems of even greater complexity. 
At present the maximum number of ir- 
relevant bits of information in two- 
choice CI problems has been five. A 
logical extension of the present study 
would be to explore the limits of in- 
dividual capacity to handle informa- 
tion of higher levels of complexity 
(e.g., 6 or 7 irrelevant bits) under 
social and individual conditions. 

One of the basic postulates of the 
mathematical theory of CI is that 
learning rate should decrease with in- 
creases in irrelevant information 
(Bourne & Restle, 1959, p. 288). How- 
ever, that assumption was confirmed 
in the I and RI conditions but not in 
the FI condition where Ss showed no 
decrease in learning rate from prob- 
lems with 1 to 3 irrelevant bits. That 
learning was not progressively re- 
tarded with increased task complexity 
in the FI condition was inconsistent 
with the results of past investigators 
using individual Ss (e.g., Archer, 
Bourne, & Brown, 1955; Bourne, 1957; 
Bourne & Haygood, 1959; Pishkin, 
1960; Wolfgang, Pishkin, & Lundy, 
1962). It was not possible to use the 
extended theory of CI (Pishkin & 
Blanchard, 1963) to make error pre- 
dictions or establish the value of social 
cues, since the equations apply only 
to the influence of programmed social 
cues. In conclusion, the mathematical 
theory of CI that bases its assumption 
on the performance of the individual 
working alone needs to be qualified or 
revised for pairs of learners free to 
exchange information. 


MEANINGFULNESS IN VERBAL LEARNING¹ 


Ss, 932 students of the Detroit Public School System, were used in a 
3-factor experimental design to investigate the interactions among 
5 grade levels, 2 modes of presentation (auditory vs. visual), and the 
learning of verbal materials at 2 levels of meaningfulness. A recall 
paired-associate paradigm was employed. Within the context of a 
single experimental program, heretofore unavailable, the presumed 
interaction of Modality X Grade Level and Meaningfulness has been 
confirmed. A significant interaction of Grade Level X Modality was 
found, although not in the expected direction. It is suggested that 
there is no inherently preferred modality for materials which are 
highly meaningful, that any such preference is a function of habit 
patterns. 


A unique situation arises in the 
realm of verbal learning. A set of ma- 
terials is available that can be readily 
converted from an auditory to a visual 
presentation and vice versa. With a 
great deal of generality, it may be said 
that an effect attributable to modal- 
ity of presentation is reasonable only 
in a context in which alphabet mate- 
rials are used. It is very difficult to 
create other sets of materials which 
could be called equivalent across mo- 
dalities. Historically, questions con- 
cerned with the relative efficacy of 
auditory versus visual materials have 
produced answers which point to an 
interaction of modality with other 
factors. Day and Beach (1950, p. 8) 
reviewed modality considerations in 
verbal learning. As they point out, a 
limitation of the review was the in- 
ability to make direct comparisons 
among the studies available; task re- 
quirements, materials, and paradigms 
varied to a considerable extent. Cer- 
tain conclusions were reached: (a) 
As the intelligence level and reading 
level of the subject increases, the 
visual modality becomes more effi- 
cient. (b) “The relative efficiency of a 
visual presentation increases with age 
from a definite inferiority at the age 
of six to a possible superiority at the 
age of sixteen.” (c) Increasing diffi- 
culty of material favors a visual pres- 
entation but “particularly easy mate- 
rial is better understood with an 
auditory presentation.” 


1 The present study is based, in part, on a 
portion of Cooperative Research Project 
#1001, Office of Education, United States 
Department of Health, Education, and 
Welfare. 

Recently, Williams and Derks 
(1963) concluded that modality of 
presentation produced a significant 
interaction with materials. The sub- 
jects were college students engaged in 
a paired-associate task. Keppel (1964) 
reviewed verbal learning in children 
with the aim of structuring experi- 
mental findings within current learn- 
ing theory. Little information was un- 
covered with regard to the effects of 
modality of presentation. Gaeth (1960, 
1963) has approached the problem of 
modality effects. The direct concern 
was the comparison and evaluation of 
differences in performance between 
normal and hearing-handicapped pop- 
ulations. In the first report, only mo- 
dality of presentation and grade level 
were varied. No pattern of modality 
effects was evident, although modality 
interacted significantly with grade 
level. In the second study, the ranges 
of grade level and type of material 
were expanded. A portion of that data 
provides the uniform testing condi- 
tions for a direct confirmation of ef- 
fects of modality over a range of 
maturational levels. 


METHOD 


Design 


A three-factor experimental design was 
employed. Five grade levels (4, 5, 6, 10, 
and 12) were represented. Two levels of 
meaningfulness (consonant-vowel-consonant 
trigrams and three-letter nouns) and two 
types of presentation (visual and auditory) 
were used. 


Subjects 


The subjects were 932 students from the 
fourth, fifth, sixth, tenth, and twelfth grades 
in the Detroit Public School System. These 
were randomly selected from within schools 
which in turn were selected, with the aid of 
the system's administration, as a typical 
cross-section of middle-class urban schools. 


Apparatus 


A slide-projector presented visual mate- 
rial, and a dual-channel tape recorder, audi- 
tory material. The second channel of a tape 
contained signals which activated the slide- 
changing mechanism of the projector. 


Materials 


Table 1 presents the specific items used 
as stimuli and responses at each of the two 
levels of meaningfulness. In addition, 
Archer’s association values for the CVC 
nonsense syllables are listed beside the spe- 
cific nonsense syllables, as is the mean for 
the list. It will be noted that the set of re- 
sponse items is the same for the two levels 
of stimulus meaningfulness. 


TABLE 1 

STIMULUS AND RESPONSE PAIRS FOR 
EACH TYPE OF MATERIAL 

          Stimuli 
CVC   Association value   Nouns   Responses 
WUB   21                  CAT     H 
KEZ   29                  ICE     M 
JID   35                  ARM     V 
DAQ   38                  RUG     L 
WOJ   13                  SKY     X 
ZEG   25                  GUN     F 
Mean  26.8 
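
The list mean reported in Table 1 can be checked against the six association values listed for the CVC trigrams. The dictionary below simply restates those values.

```python
# Archer's association values for the six CVC stimuli, as listed in Table 1.
association_values = {
    "WUB": 21,
    "KEZ": 29,
    "JID": 35,
    "DAQ": 38,
    "WOJ": 13,
    "ZEG": 25,
}

list_mean = sum(association_values.values()) / len(association_values)
print(round(list_mean, 1))
```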


Procedure 


The recall modification of a paired-asso- 
ciate paradigm was used. On a visual learn- 
ing trial, the stimulus was presented for 2.5 
seconds followed by the response for equal 
duration. The interpair interval was 55 
seconds. The intertrial interval was approx- 
imately 15 seconds. A test trial followed the 
same procedure except for the omission of 
the response. The subjects were instructed 
to write the appropriate response, if they 
had learned it, during the 5.5-second inter- 
stimulus interval of a test trial. A learning 
trial was given first, followed by a test trial, 
another learning trial, and so on until 10 
learning and 10 test trials had been com- 
pleted. The order of each trial was ran- 
domized. In the auditory presentation a 
single pronunciation of a stimulus or a re- 
sponse was made in lieu of a visual pro- 
jection. Timing remained the same. Testing 
was done in classroom sections. 


RESULTS 


The mean correct performance for 
10 test trials of the 20 conditions can 
be seen in Figure 1. A three-way 
analysis of variance was computed 
using Winer’s (1962) model for unequal 
cell frequency based on unweighted 
means. The three main effects 
were significant: grades, F (4, 912) = 
90.08, p < .01; materials, F (1, 912) = 
213.52, p < .01; modalities, F (1, 912) 
= 10.94, p < .01. Two interactions 
were significant: Modality × Material, 
F (1, 912) = 4.12, p < .05; Grade × 
Modality, F (4, 912) = 3.35, p < .05. 
The remaining two-way interaction, 
Grade × Material, and the three-way 
interaction were not significant. The 
significant main effects expected for 
grade level and type of material were 
found. Of particular interest is the 
significant main effect of modality as 
well as the significant interactions of 
modality with materials and with 
grade level. The significant interactions 
may be studied graphically in 
Figure 1. As can be seen, performance 
with nouns is superior with a visual 
presentation in the fourth grade, while 
an auditory presentation is superior in 
the twelfth grade. For trigrams, a 
visual presentation is consistently 
more efficient. 

FIG. 1. Mean performance for 10 trials plotted against grade level for each 
modality of presentation and type of material. 


Discussion 


To be sure, the majority of the ef- 
fects which can be inferred from the 
analysis of variance are completely in 
line with previous research. The main 
effects of grade level, meaningfulness 
of material, and modality fall into 
this category, although the effect of 
modality is not as clearly established 
in the literature as are the other two 
parameters. The three-way interaction 
is not significant, indicating that 
we are not dealing with an overly 
complex situation. Nor would an in- 
teraction of Grade X Meaningfulness 
be expected. The interaction of Mean- 
ingfulness X Modality has not been 
previously examined in the context of 
a single experiment. However, its sta- 
tistical significance, as found in this 
examination, would be expected from 
the review made by Day and Beach. 
In examining Figure 1, it is clear that 
the more difficult CVC material lends 
itself to visual presentation across 
grades, which is the converse of the 
statement that “particularly easy ma- 
terial” is favored by an auditory pres- 
entation. It will be noted that no hint 
of the Grade x Modality interaction 
can be seen with this type of material. 
The noun materials demonstrate no 
such uniformly more effective modality. 

The one unexpected finding, which 
forms the crux of this report, is the 
source of the interaction between 
Grade Level x Modality, and not its 
statistical significance. Day and 
Beach’s summary provides ample evi- 
dence that such an interaction may be 
expected to be significant. From their 
review, “The relative efficiency of a 
visual presentation increases with age 
from a definite inferiority at the age 
of six to a possible superiority at the 
age of sixteen.” Recall that “particu- 
larly easy material” is favored by an 
auditory presentation. In fact, the 
generalizations are not substantiated. 
We find that visually presented noun 
materials provide superior perform- 
ance in the fourth, fifth, and sixth 
grades while producing inferior results 
in the tenth and twelfth grades. This 
shift of more efficient modality is sig- 
nificant, as interpreted by the Grade 
X Modality interaction and, in addi- 
tion, is in a direction opposite to that 
suggested by Day and Beach. 

The basis for Day and Beach’s pre- 
diction would seem to be their depend- 
ence upon two determining factors, the 
reading level of the subject and the 
nature of the materials. In the first 
case it is presumed that the greater 
degree of proficiency in and experience 
with oral communication would in- 
duce in the younger child a disposition 
toward oral communication which 
would lessen as reading ability and 
experience increased. In the second 
case the advantages of a visual pres- 
entation, its stability in time, its 
discreteness, and its inherent numeri- 
cal advantage in terms of manipulat- 
able parameters (versus the number 
available in an auditory presenta- 
tion), lend themselves to communica- 
tion of less familiar or more difficult 
material. It is felt that insufficient 
emphasis has been placed on a third 
factor, that at some point in language 
development there ceases to be a functional 
distinction between modalities. 
The least that can be said for lan- 
guage is that it must provide for func- 
tional equivalence between modalities 
or it fails in its purpose. The param- 
eters which govern the effectiveness of 
one modality over the other have no 
meaning when applied to highly fa- 
miliar material. Any differences be- 
tween modalities would seem to be a 
function of tendencies engendered by 
habitual manners of usage as opposed 
to inherent qualities of materials or 
subject variables such as reading 
level. 


EXPOSITORY INSTRUCTION VERSUS A 
DISCOVERY METHOD 


JOHN T. GUTHRIE 
University of Illinois 


To test the proposition that discovery learning facilitates retention 
and transfer, 72 college seniors were taught to decipher cryptograms 
with 3 instructional sequences: Rule-Example, Example-Rule, and 
Example. A No-Training control group was included. On the retention 
test, the Rule-Example group was superior to all groups and the other 
groups did not differ. On the remote transfer test, the Example group 
was superior to all others, the Example-Rule and No Training did not 
differ, and the Rule-Example group was inferior to the others. All 
differences were significant at the .05 level. The discovery method 
appears to facilitate transfer, but not retention; expository instruction 
facilitates retention, but impedes remote transfer. 


It has been argued that the dis- 
covery method facilitates the reten- 
tion of subject matter (Ausubel, 1963; 
Bruner, 1961). However, empirical 
research fails to sustain these opin- 
ions. When speed of learning and re- 
tention were used as criteria, instruc- 
tion containing rules has proved 
superior to instruction without rules 
(Craig, 1956; Haselrud & Meyers, 
1958; Kittell, 1957; Wittrock, 1963). 
In addition, the only two studies 
(Hendrix, 1961; Katona, 1940) which 
oppose this generalization are difficult 
to interpret because the control and 
experimental groups were incompara- 
ble. 

A more popular stand is that the 
discovery method is primarily useful 
for producing transfer and general 
problem-solving ability (Bruner, 1961; 
Suchman, 1961). However, when 
transfer is used as the criterion for 
learning, the results have been equivo- 
cal. Craig (1956) found no difference 
between “independent” and “directed” 
groups on transfer to new rules. Haselrud 
and Meyers (1958) confirmed this 
finding, although this conclusion is 
suspect due to uncontrolled practice 
effects resulting from a within-sub- 
jects design. An advantage on trans- 
fer for instruction with “rule given” 
as compared with “rule not given” is 
reported by Wittrock (1963). How- 
ever, the present author’s examina- 
tion of the rule group and the no-rule 
group with feedback present for both 
indicates that there was an advantage 
(t = 2.50, df = 145, p < .05) for the 
no-rule group. Gagné and Brown 
(1961) also suggest that a “guided dis- 
covery” training procedure is superior 
to a “rule and example” procedure for 
producing transfer. Their treatment 
conditions, however, appear to have 
favored the “guided discovery” group. 
This group was trained on the specific 
type of behavior required by the cri- 
terion, whereas the “rule and example” 
group was taught a different type of 
behavior. Finally, Kittell (1957) concluded 
that giving a rule at the outset 
of instruction was superior to not giving 
a rule on all criteria: speed of 
learning, retention, and transfer. His 
conclusion, however, is unjustified 
since a substantial number of the sub- 
jects (Ss) taught by the discovery 
method failed to meet the initial 
learning criteria. It is doubtful that 
Ss would be able to transfer behavior 
which was never learned. 

The above research clearly demon- 
strates that instruction with rules pro- 
duces faster learning and better re- 
tention than instruction without rules. 
On the other hand, the empirical evi- 
dence concerning techniques for pro- 
ducing transfer is inconclusive. Conse- 
quently, the more central proposition 
that discovery learning facilitates 
transfer remains virtually untested. 
The purpose of the present experiment 
was to compare several methods of 
instruction including both rules and 
examples and only examples on the 
basis of several criteria of retention 
and transfer. It was hypothesized that 
instruction with rules and examples 
would facilitate retention, but not 
transfer. And training with only ex- 
amples would facilitate transfer, but 
not retention. 


METHOD 


The Ss were 72 students enrolled in 
an undergraduate educational psychology 
course. All Ss were taught to decipher 
cryptograms. The cryptograms were taken 
from words 4-10 letters in length occurring 
20-30 times per million in adult reading 
matter (Thorndike & Lorge, 1944). The 
letters were scrambled according to one of 
six rules. Rules 1, 2, and 3 were transposi- 
tional; 4, 5, and 6 were substitutional. Rule 
1: Exchange the first and the last letters in 
each cryptogram. Rule 2: Reverse the order 
of the first half and the last half of the 
letters. Rule 3: Remove the first 2 letters 
and place them in the middle of the rest of 
the cryptogram. Rule 4: Replace the first 
letter with the letter succeeding it in the 
alphabet. Rule 5: Replace the numbers with 
the correct vowel (a, e, i, o, u equal 1, 2, 
3, 4, 5). Rule 6: Replace the last letter with 
the letter preceding it in the alphabet. 
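
The six rules can be written out as small deciphering functions. The Python sketch below follows the wording of each rule as stated above, under several assumptions the text does not settle: for words with an odd number of letters the "first half" is taken as the shorter half, the "middle" in Rule 3 is the integer midpoint of the remaining letters, and Rules 4 and 6 wrap around at the ends of the alphabet. The word "garden" and the six cryptograms built from it are hypothetical illustrations, not items from the study.

```python
import string

# Digit-to-vowel key for Rule 5, as given in the text (a, e, i, o, u = 1..5).
VOWEL_FOR_DIGIT = {"1": "a", "2": "e", "3": "i", "4": "o", "5": "u"}


def shift_letter(letter, step):
    """Move a lowercase letter `step` places along the alphabet.

    Wrapping past z or a is an assumption; the text does not cover it.
    """
    alphabet = string.ascii_lowercase
    return alphabet[(alphabet.index(letter) + step) % len(alphabet)]


def rule_1(cryptogram):
    """Exchange the first and the last letters."""
    return cryptogram[-1] + cryptogram[1:-1] + cryptogram[0]


def rule_2(cryptogram):
    """Reverse the order of the first half and the last half."""
    half = len(cryptogram) // 2  # assumed split point for odd lengths
    return cryptogram[half:] + cryptogram[:half]


def rule_3(cryptogram):
    """Move the first two letters into the middle of the rest."""
    head, rest = cryptogram[:2], cryptogram[2:]
    middle = len(rest) // 2  # assumed midpoint
    return rest[:middle] + head + rest[middle:]


def rule_4(cryptogram):
    """Replace the first letter with the letter succeeding it."""
    return shift_letter(cryptogram[0], 1) + cryptogram[1:]


def rule_5(cryptogram):
    """Replace each number with the corresponding vowel."""
    return "".join(VOWEL_FOR_DIGIT.get(ch, ch) for ch in cryptogram)


def rule_6(cryptogram):
    """Replace the last letter with the letter preceding it."""
    return cryptogram[:-1] + shift_letter(cryptogram[-1], -1)


DECIPHER = {1: rule_1, 2: rule_2, 3: rule_3, 4: rule_4, 5: rule_5, 6: rule_6}

# Hypothetical cryptograms, one per rule, all built from the word "garden".
examples = {1: "nardeg", 2: "dengar", 3: "rdgaen",
            4: "farden", 5: "g1rd2n", 6: "gardeo"}

for number, cryptogram in examples.items():
    print(number, cryptogram, "->", DECIPHER[number](cryptogram))
```

Rules 1, 2, and 3 only rearrange the letters already present, whereas Rules 4, 5, and 6 change a character, which is the distinction between the transpositional and substitutional classes.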

There were four treatment conditions to 
which Ss were assigned at random. In the 
Example-Rule group, examples of crypto- 
grams were presented until a criterion of 
eight consecutive correct responses was attained. 
The rule was then taught with a 
programming technique until the Ss could 
verbalize it upon request. Under the Rule- 
Example condition, the Ss were first taught 
the rule through the program. Examples of 
cryptograms governed by the rule were then 
given until the criterion was met. The Ex- 
ample condition consisted of presenting only 
examples of cryptograms until the criterion 
was attained. The control group received 
no training on deciphering cryptograms, 
but spent a comparable amount of time 
learning Russian vocabulary. All groups 
were subdivided. Half the Ss in each group 
were taught two transpositional rules (1 
and 2); and half were taught two substitu- 
tional rules (4 and 5). 

The test was composed of 30 crypto- 
grams, and was subdivided into 3 parts. The 
first part contained 10 cryptograms formed 
from words not seen in training and gov- 
erned by rules not used in training. The 
rules for this part were drawn from the 
opposite class of rules used in training. For 
instance, if S was trained to solve crypto- 
grams composed from transpositional rules, 
this part required him to solve cryptograms 
composed from substitutional rules, and vice 
versa. This was considered the “remote” 
transfer task. The second part contained 10 
cryptograms formed from words not seen in 
training, but governed by exactly the same 
rules as used in training. This was considered 
the retention task. In the third part, the 10 
cryptograms were formed from new words 
and governed by rules not used in training. 
However, these rules were drawn from the 
same class of rules (transpositional vs. sub- 
stitutional) as used in training. This was 
labeled the “near” transfer task. 
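
The relation between the rules used in training and the rules eligible for each part of the test can be summarized in a few lines of Python. This is a sketch of the design as described above; the function name and the idea of returning a "pool" of candidate rules are illustrative assumptions, since the text does not state which specific rules of the opposite class appeared in the remote-transfer part.

```python
# Rule classes as defined in the Method section.
TRANSPOSITIONAL = {1, 2, 3}
SUBSTITUTIONAL = {4, 5, 6}


def rule_pools(trained_rules):
    """Map each test part to the rules it could draw on, given the trained rules."""
    trained = set(trained_rules)
    if trained <= TRANSPOSITIONAL:
        same_class, opposite_class = TRANSPOSITIONAL, SUBSTITUTIONAL
    elif trained <= SUBSTITUTIONAL:
        same_class, opposite_class = SUBSTITUTIONAL, TRANSPOSITIONAL
    else:
        raise ValueError("trained rules must all come from one class")
    return {
        "remote transfer": sorted(opposite_class),      # untrained, opposite class
        "retention": sorted(trained),                   # same rules, new words
        "near transfer": sorted(same_class - trained),  # untrained, same class
    }


print(rule_pools([1, 2]))
print(rule_pools([4, 5]))
```

Under this reading, Ss trained on Rules 1 and 2 would meet Rule 3 in the near-transfer part, and Ss trained on Rules 4 and 5 would meet Rule 6.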

The Ss were seated at a table facing an 
upright divider (2½ × 5 feet) with a small 
centered window (3 × 5 inches) in which 
the stimuli were presented. Written instruc- 
tions were first given to each S on a sheet 
of paper. The instructions, which were 
minimal, directed the S to say the word 
suggested by each cryptogram and informed 
him that he would be tested. The instruc- 
tions for the control group pertained only 
to the Russian vocabulary. 

The Ss received successive presentation 
of cryptograms conforming to one of the 
rules. Each S was presented cryptograms 
until a criterion of 8 consecutive correct re- 
sponses was attained or until 25 cryptograms 
had been presented. The order of presen- 
tation was randomized for each S. Each 
trial consisted of presenting a cryptogram 
and requiring S to verbalize the correct 
word. The length of the trials was 15 seconds. 
Feedback, which consisted of presenting 
the cryptogram with the correct word 
beside it, followed each trial. The duration 
of feedback was 5 seconds. The time lapse 
between feedback and the following trial 
was about 5 seconds. Each S was taught to 
decipher cryptograms governed by two 
rules. After the two rules were learned, the 
test was given. Directions for the test in- 
formed the S of the test and stated that 
some of the correct answers were obtainable 
by substituting one letter for another letter 
in the alphabet, and other answers were ob- 
tainable by rearranging the letters within 
the cryptogram. The S was informed when 
he entered each of the test subdivisions. 
The approximate time required for the test 
was 10 minutes. 
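
The training criterion described above (eight consecutive correct responses, or a ceiling of 25 cryptograms) amounts to a simple stopping rule. The Python sketch below is one way to express it; the function name and the representation of responses as a sequence of True and False values are assumptions made for illustration.

```python
from itertools import islice


def trials_to_criterion(responses, run_needed=8, max_trials=25):
    """Return (trials used, criterion reached) for one rule.

    `responses` yields True for a correct response and False otherwise.
    Training stops at the first run of `run_needed` consecutive correct
    responses, or after `max_trials` cryptograms, whichever comes first.
    """
    run = 0
    used = 0
    for used, correct in enumerate(islice(responses, max_trials), start=1):
        run = run + 1 if correct else 0
        if run >= run_needed:
            return used, True
    return used, False


# One early error followed by eight correct responses: criterion on trial 9.
print(trials_to_criterion([False] + [True] * 8))  # prints (9, True)
```

Because each S learned two rules, the trials-to-criterion score for an S would be the sum of the counts obtained for the two rules.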


RESULTS 


The data were initially analyzed 
with a series of orthogonal planned 
comparisons, but the planned com- 
parisons appeared too general to ade- 
quately represent the data. Conse- 
quently, simple one-way analyses of 
variance were computed for each of 
the test subdivisions, including retention, 
near transfer, and remote transfer, 
and for the trials to criterion. The 
means, standard deviations, and F 
ratios are presented in Table 1. 

The overall significant differences 
permitted the use of the Tukey procedure 
(Edwards, 1963) for comparing 
means (see Table 2). First, on the 
remote-transfer task, the Example 
group was significantly superior to all 
other groups. The  Example-Rule 
group did not differ from No Training. 
And the Rule-Example group was sig- 
nificantly inferior to all other groups. 
On the near-transfer task, the Exam- 
ple group and the Example-Rule 
group were superior to No Training 
and Rule-Example, but did not differ 
from each other. The Rule-Example 
group was not different from the con- 
trols. 

On the retention task, there do not 
appear to be any differences among 
the treatment groups, though all are 
superior to no training. However, for 
reasons outlined in the discussion, it 
is believed that the retention test for 
those trained on substitutional rules 
was incapable of discriminating 
among treatments. Consequently, 
analyses of variance were conducted 
separately for Ss trained on substitutional 
rules and Ss trained on transpositional 
rules. Significant differences 
were found for both groups (see 
Table 1). The Tukey procedure was 
then employed to compare the means 
(see Table 2). As expected, there were 


TABLE 1 
MEANS AND STANDARD DEVIATIONS OF ERRORS AND TRIALS TO CRITERION 

                               Rule-example   Example-rule     Example       No training 
                                 M      SD      M      SD      M      SD      M      SD         F 
Remote transfer                 5.94   1.59    4.26   2.20    2.56   2.73    4.07   1.15      6.64** 
Near transfer                   5.23   2.26    2.84   2.00    2.27   1.05   [illegible]    [illegible] 
Retention                       1.94   2.37    3.06   2.31    2.89   2.42   [illegible]    [illegible] 
Retention (Substitutional)      1.29   1.70    1.56   1.06    1.11   0.29    4.44   2.64      7.18** 
Retention (Transpositional)     2.55   1.37    4.56   2.26    4.67   2.25    6.11   1.41    [illegible] 
Trials to criterion            11.92   6.96   23.66   8.51   22.72   8.81     —      —      [illegible] 

Note.—All test data are based on errors. Trials to criterion represent the total trials 
required to learn two rules. 

* p < .025. 
** p < .01. 
TABLE 2 
TUKEY COMPARISON OF MEANS 

     Remote transfer   Near transfer   Retention      Retention          Retention           Trials to 
                                                      (substitutional)   (transpositional)   criterion 
1.   Example           Example         Rule-Example   Example            Rule-Example        Rule-Example 
2.   Example-Rule      Example-Rule    Example        Rule-Example       No training         Example 
3.   No training       No training     Example-Rule   Example-Rule       Example-Rule        Example-Rule 
4.   Rule-Example      Rule-Example    No training    No training        Example             — 

Note.—Treatments are in rank order. The treatments connected by a common line do not differ. All other 
treatments differ significantly at the .05 level. 


no differences among groups trained 
on the substitutional rule, though the 
control group was inferior to all treat- 
ment groups. However, for groups 
trained with transpositional rules 
the Rule-Example was superior to all 
other groups. In addition, the Exam- 
ple and Example-Rule groups did not 
differ from the No-Training group. 
To analyze the time required for 
learning under various conditions, a 
one-way analysis of variance was 
conducted on all treatment groups us- 
ing trials to criterion as the dependent 
variable. First, the general differences 
were significant. In addition, the 
Tukey comparison showed that the 
Rule-Example group learned faster 
than the other two groups; and that 
the Example and  Example-Rule 
groups did not differ from each other. 


Discussion 


At the outset of the experiment, it 
was hypothesized that the rules would 
function as eliciting stimuli. In other 
words, it is suggested that rules in 
verbal learning are analogous to the 
unconditioned stimuli in classical con- 
ditioning. They serve to evoke a re- 
sponse which is then reinforced. For 
example, when the rule was presented 
in the Rule-Example group and a 
cryptogram was presented in close 
succession, specific deciphering be- 
havior was evoked from Ss. When 
these Ss were presented a cryptogram, 
they manipulated the letters specifically 
according to the rule to form the 
word. It was speculated that this be- 
havior would then be reinforced 
through feedback and acquired. That 
such behavior was, in fact, acquired 
is evident from the results indicating 
that Ss in the Rule-Example group 
performed well on retention tasks. 

It was also reasoned at the outset 
that when no rules were given to Ss, 
their behavior would more closely ap- 
proximate the operant conditioning 
paradigm. That is, presenting Ss a 
cryptogram would evoke searching, 
exploratory behavior which would be 
reinforced when a word was made 
from the cryptogram. That such ex- 
plorative skill was acquired is evi- 
denced from the result that the Ex- 
ample and  Example-Rule groups 
surpassed the Rule-Example and the 
Control groups on the transfer, but 
not on the retention tasks. 

It should be noted that the conclu- 
sions regarding retention have been 
drawn from the data of Ss taught 
transpositional rules; and the data 
from Ss taught substitutional rules 
have been ignored. The reason for 
this is that one of the substitutional 
rules contained a number, whereas all 
other rules contained only letters. It 
is likely that this number acted as a 
distinctive cue which facilitated re- 
tention in all groups. Thus a ceiling 
effect for retention of Ss taught sub- 
stitutional rules prevented differences 
on this measure from manifesting 
themselves. 


EFFECT OF CURIOSITY ON INCIDENTAL LEARNING 


WILLIAM PARADOWSKI 
Teachers College, Columbia University¹ 


Curiosity was aroused in an S by the presentation of 5 illustrations of 
strange-looking animals. By contrast, to determine the effect of low 
arousal, 5 illustrations of familiar animals were shown to the same S. 
Paragraphs of verbal information were paired with each illustration and 
provided an intentional learning task. 52 undergraduate Ss were used. 
Intentional learning was assessed for each item as was incidental learn- 
ing, which consisted of the posttest recall of the settings and border 
colors around each animal illustration. Curiosity arousal—stimulated 
by the pictures of novel animals—significantly increased both inten- 
tional learning and incidental learning. This finding is at sharp variance 
with the generalization based on research on aversive drive states that 
drive arousal leads to reduced incidental learning. 


A recently proposed generalization 
maintains that drive arousal acts con- 
sistently to reduce the range of cue 
utilization. According to Easterbrook 
(1959), the range of cue utilization is 
said to have fallen when the amount 
of incidental learning has been reduced 
although task learning has remained 
constant or been improved. A growing 
number of experiments have demon- 
strated this principle using a variety 
of drive states—for example, hunger 
(Bruner, Matter, & Papanek, 1955), 
thirst (Johnson, 1952), and anxiety 
(Silverman, 1954)—of which all can 
be classified as aversive in nature. In 
proposing his generalization, Easter- 
brook considered drive as of aversive 
character, and defined it as an innate 
response to a state of biologic depriva- 
tion or noxious stimulation. 

Research on the tendency for 
aroused aversive drives to reduce in- 
cidental learning has been moderately 
consistent in its positive findings. No 
investigation, however, has so far been 
undertaken to see if a fundamentally 
nonaversive drive would show the same 
tendency—as, for example, curiosity, 
a drive characterized by positive affect 
and approach behavior. There seems 
little reason to expect that with regard 
to range of cue use an organism aroused 
by curiosity would necessarily function 
in the same way as when aroused 
by an aversive drive. 

1 Adapted from a dissertation submitted 
to Teachers College, Columbia University, 
for the doctoral degree in clinical psychology. 

For at least the past decade a num- 
ber of theorists have emphasized the 
importance of recognizing a funda- 
mentally different class of drives or 
motives than those classed together 
as aversive or as dependent on the 
homeostatic regulation of organic 
need. Woodworth (1958), for example, 
proposed that there are two independ- 
ent classes of motives: need-primacy 
(which refers to motives primarily 
directed to the satisfaction of the in- 
ternal needs of the organism) and be- 
havior-primacy (which refers to mo- 
tives directly concerned with the 
interaction between the organism and 
environment). White (1959) has sup- 
ported a similar view. Hebb (1955) 
also has made a restatement concern- 
ing different effects of drive on learn- 
ing and motivation. Referring specif- 
ically to curiosity, some educators 
have conceived of it as a condition 
whereby an individual becomes more 
sensitive to cues in his environment 
(Maw & Maw, 1962). 

Therefore, it is hypothesized, first, 
that arousal by curiosity has an effect 
on incidental learning different from 
that evoked by low arousal. Second, 
evidence is sought that would support 
a proposition that (a) curiosity 
arousal increases incidental learning 
or, alternatively, that (b) curiosity 
arousal decreases incidental learning. 


METHOD 


The experimental design followed the 
model of incidental learning studies in 
which intentional and incidental tasks are 
so incorporated into one learning situation 
that the effects of a high- and low-arousal 
condition may be compared for each task. 
A booklet of written and illustrated material 
was prepared to conform to this design. 
Pictures of animals were paired with para- 
graphs of information about the animals so 
that the written matter provided an inten- 
tional task and details in the illustrated 
matter provided an incidental task. 

Since odd or strange-looking animals are 
a frequent source of curiosity, it was 
planned that pictures of such animals 
would be used—in contrast to pictures of 
quite familiar animals—to achieve conditions 
of high and minimal curiosity. The 
pictures of the animals used in the experiment 
proper were chosen on the basis of an 
auxiliary study. Twenty-four animal illustrations 
were prepared by the experimenter, 
12 of which had been selected for their 
novelty and 12 for their ordinary appearance. 
Twenty subjects (Ss), employed in 
the auxiliary group, were asked to rank the 
pictures of the animals from most interesting 
to least interesting. From the results, 
five animals consistently ranked as most 
interesting were selected as the experiment’s 
“novel” animals, and five animals consistently 
ranked as least interesting were 
chosen to be the “common” animals. Of the 
novel animals, the one with the highest 
rank position for interest was an imaginary 
creation with a mean rank position of 4.55. 
The other four novel animals represented 
actual species: a pichiciago (5.25); a 
megatherium (5.45); an elephant shrew (6.35); 
and a hemigale (6.80). The common animals, 
with their mean rank positions, included: 
a donkey (18.60); a deer (18.95); a 
dog (19.80); a sheep (20.00); and a 
muskrat (20.60). 

Since the stimulus for curiosity was to be 
an animal illustration, it seemed practical 
to provide a background to the animal that 
could serve as a source of incidental learn- 
ing material. With this in mind, 10 easily 
identifiable environmental settings were 
drawn. These comprised five types of en- 
vironment with each type represented by 
two different drawings. The five types included 
a desert, forest, jungle, swamp, and 
one of mountainous terrain. Recall of a set- 
ting represented an item of incidental learn- 
ing. In addition, 1-inch colored borders were 
placed around each picture, five colors being 
used, each repeated twice. Color recall pro- 
vided a second kind of incidental learning 
item. 

A valid comparison of the incidental 
learning achieved for the novel animals as 
opposed to the common animals hinged on 
the comparability of the incidental cues 
paired with each set of animals. Therefore, 
in order to equate the incidental learning 
task for the two conditions, two sets of 
pictures were prepared. In one set the five 
novel and five common animals were placed 
in particular environmental settings; in the 
second set the relation between animal and 
background was reversed. For example, if a 
novel animal appeared in a forest in Set A, 
then a common animal appeared in the 
same forest in Set B. This reversal procedure 
applied to colored borders as well as to 
environmental settings. Two groups of com- 
parable Ss were used, one for each set of 
pictures. In this way, incidental learning 
could be assessed for all the settings and 
all colored borders—once when paired with 
a common animal and once when paired 
with a novel animal. Particular effort was 
made to avoid placing animals in environ- 
mental settings that might produce an in- 
congruous effect. 
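
The reversal procedure can be pictured as two assignments of animals to the same ten background slots. The Python sketch below illustrates the idea only: the animal and setting names are taken from the text, but the border colors, the tying of each color to a setting type, and the particular animal-to-slot pairing are hypothetical, since the text does not report them.

```python
NOVEL = ["imaginary animal", "pichiciago", "megatherium", "elephant shrew", "hemigale"]
COMMON = ["donkey", "deer", "dog", "sheep", "muskrat"]
SETTING_TYPES = ["desert", "forest", "jungle", "swamp", "mountains"]
COLORS = ["red", "blue", "green", "yellow", "orange"]  # hypothetical color names

# Ten background slots: two drawings of each setting type, each color used twice.
SLOTS = [
    (f"{setting} drawing {drawing}", color)
    for drawing in (1, 2)
    for setting, color in zip(SETTING_TYPES, COLORS)
]


def build_set(first_five, last_five):
    """Assign ten animals to the ten (setting, border color) slots in order."""
    return dict(zip(SLOTS, first_five + last_five))


set_a = build_set(NOVEL, COMMON)  # novel animals occupy the first five slots
set_b = build_set(COMMON, NOVEL)  # the relation is reversed in the second set

# Every slot is paired once with a novel and once with a common animal.
for slot in SLOTS:
    pair = {set_a[slot], set_b[slot]}
    assert len(pair & set(NOVEL)) == 1 and len(pair & set(COMMON)) == 1
```

With one group of Ss per set, recall of any given setting or border color is measured once under the novel condition and once under the common condition.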

The complete experimental instrument 
consisted of the described pictures presented 
together with written "factual" information 
about each animal. The material was pre- 
sented in a large book, and arranged so that 
an illustration with a colored border ap- 
peared on the left-hand page and written 
matter on the right-hand page. The instru- 
ment was prepared in this way to resemble 
material that could conceivably have come 
from a textbook on vertebrate paleontology. 

Two such books of material were pre- 
pared. The second book was identical in all 
respects to the first except that wherever 
a novel animal appeared in the one set of 
illustrations, a common animal appeared in 
the other. 

The Ss were told that they were to assist 
in the evaluation of a forthcoming general 
text for undergraduates by taking short 
tests on each of the presented items to 
assess the ease with which the material 
could be learned. This rationale provided 
the basis for the central task of the experiment. 
After looking at a picture of an animal 
and then reading a paragraph of “factual” 
material supposedly about the pictured 
animal, each S was given a completion test 
on the paragraph read. As far as Ss were 
concerned, their main purpose in studying 
the presented material was to do the com- 
pletion tests. The content of the paragraphs 
was concerned with information of such a 
trivial nature that—to a naive reader—it 
could have easily applied to any animal. 
None of the presented statements of infor- 
mation conflicted with known facts about 
familiar animals. Since two sets of pictures 
had been prepared, each paragraph was 
also used twice, once paired with a novel 
animal and once paired with a common ani- 
mal. As an example, one of the paragraphs 
used in the study follows: 

Fossil remains of this animal indicate 
an age of 36 million years. It has been 
determined to be of Miocene origin 
and is generally considered as a mar- 
ginal species between the Tertiary and 
Quaternary periods. The only fossils of 
this species have come from central 
Africa. The feet of this species have remained 
essentially unchanged from its 
primitive predecessors as well as its 
phalangeal formula. The teeth too are 
essentially unchanged with a dental 
formula of five upper and four lower 
incisors, a canine, three premolars, and 
four molars. This species shows a rela- 
tively fast postnatal development 
whereby the young are independent of 
their mothers and capable of self-care at 
an early age. 

A second auxiliary study was done to 
prepare a completion test on the “factual” 
material. It was desired that the completion 
test give a reliable measure of what the Ss 
had learned from the central task material 
in order to determine whether or not curios- 
ity arousal could also affect the retention of 
that material for which a direct learning set 
was given. The paragraphs of information 
were reproduced with 12 word deletions 
and were presented to 20 Ss with the com- 
plete paragraphs in alternating order. The 
Ss were permitted to read each paragraph 
once and then were given the appropriate 
completion test. Based on the results of this 
pilot study, the final completion test was 
composed of six deletions. Completion items 
were chosen that were answered correctly 
by approximately 50% of the auxiliary 
group. Thus, items that were too easy or 
too difficult were eliminated. 
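
The item-selection step can be sketched as a filter on the pilot results. In the Python example below, the pilot counts, the tolerance of three Ss on either side of 50%, and the tie-breaking by item number are all hypothetical; the text states only that six deletions answered correctly by approximately 50% of the 20 auxiliary Ss were retained from the 12 tried.

```python
def select_items(correct_counts, n_subjects=20, max_distance=3, n_items=6):
    """Pick the deletions whose pilot difficulty lies closest to 50% correct.

    `correct_counts` maps an item number to the number of pilot Ss who
    answered it correctly. Items more than `max_distance` Ss away from
    half the group are dropped as too easy or too difficult.
    """
    target = n_subjects / 2
    eligible = {
        item: abs(count - target)
        for item, count in correct_counts.items()
        if abs(count - target) <= max_distance
    }
    ranked = sorted(eligible, key=lambda item: (eligible[item], item))
    return ranked[:n_items]


# Hypothetical pilot results for the 12 deletions in one paragraph.
pilot = {1: 19, 2: 10, 3: 11, 4: 3, 5: 9, 6: 12,
         7: 8, 8: 17, 9: 10, 10: 2, 11: 13, 12: 6}

print(select_items(pilot))  # prints [2, 9, 3, 5, 6, 7]
```

Items answered by nearly everyone (item 1) or by almost no one (item 10) fall outside the band and are dropped, which mirrors the elimination of items that were too easy or too difficult.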


Procedure 


Two groups of 26 undergraduates were 
used as Ss. They were oriented to the task 
by being asked to assist in evaluating material 
from a forthcoming textbook. The book 
of material was then opened by the experi- 
menter who also controlled the presentation 
of each test item. A practice item was first 
shown to Ss which was identical in format to 
the succeeding 10 experimental items. While 
the written paragraph was covered, the illus- 
trated material was exposed for 10 seconds. 
At the end of this time the written text 
was exposed for 20 seconds while the illus- 
tration was covered. The page was then 
turned, thus terminating exposure to the 
test material. A completion test for the paragraph 
read was immediately presented to 
the Ss. Following the completion items, a 
multiple-choice test was administered which 
was concerned solely with the appearance of 
the animal. (Such a test was necessary as a 
rationale for having the Ss look at the 
pictures. The scores, however, were not 
tabulated or in any way used in the final 
results.) Upon completing the two tests, Ss 
were presented with the next test item in 
its order. After all the items were done, Ss 
were given an incidental learning test which 
consisted of 10 pages, each page containing 
a reduced illustration of one of the experimental 
animals. Beneath each illustration 
were two multiple-choice questions. The first 
question called for the selection of the actual 
background of the depicted animal; the 
second required selection of the color used 
for the border of that animal’s picture. 


RESULTS


In addition to the auxiliary study
ranking the animals for interest, a
separate auxiliary study was done to
demonstrate the preference of Ss for
looking at pictures of the novel animals
as opposed to common animals.
Only 10 Ss were used but the results
were clear. The procedure involved
presenting the complete test material,
as described, to Ss. Both the illustrated
and written material, however, were
simultaneously exposed. The Ss were
therefore free to either glance at the il- 
lustration or read the accompanying 
paragraph, as they wished. The amount 
of time Ss spent looking at the pictures 
was surreptitiously observed, timed by 
a stopwatch, and recorded. Within a 
50-second time limit per item, Ss spent 
significantly more time looking at the 
pictures of novel animals (45.5 sec- 
onds) than at the pictures of common 
animals (31.1 seconds). A two-tailed 
t test showed a .04 probability value 
for the difference in time. 

The findings of the auxiliary study 
can be applied to the experiment 
proper, although there were some dif- 
ferences in procedure. Instead of be- 
ing free to divide their time for 50 
seconds between the pictures and the 
paragraphs as in the auxiliary study, 
Ss were exposed to the pictures for 10 
consecutive seconds. The Ss invariably 
spent this entire span of time examin- 
ing the illustration. The exact amount 
of time an S looked at the animal in
an illustration as opposed to its back- 
ground cannot be determined. The 
findings of the auxiliary study, how- 
ever, suggest that novel animals 
within their settings were attended to 
proportionally longer than common 
animals within their settings. This 
assumption had been further safe- 
guarded by giving Ss a set for ex- 
amining the animals within the 
illustration. The short series of multiple-choice
questions (mentioned in the
Procedure section) was concerned
only with the appearance of the animal
and in no way involved the
settings. A common animal, by virtue
of its familiarity, is unlikely to sus- 
tain interest and attention. (For ex- 
ample, it only takes a moment to 
recognize and label a dog.) Therefore,
it is assumed that Ss spent less time
examining the backgrounds of a novel
animal than of a common animal.


Intentional and Incidental Learning 


Scores were tabulated according to 
their relation with novel or common 
animals. Thus each S had a novel con- 
dition learning score and a common 
condition learning score, both for incidental
and intentional tasks. Novel
scores (for all Ss in both groups)
were compared to common scores.
Since all extraneous factors between
the two conditions had been controlled,
direct statistical comparison
was made by means of two-tailed t
tests. The formula used for the t
tests, however, took account of the 
size of the coefficient of correlation 
between the two sets of scores. The re- 
sults are summarized in Table 1. 
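
The article does not print the formula it used. As a minimal sketch, assuming the textbook t for correlated means (in which the variance of the difference is reduced by twice the covariance), the intentional-task row of Table 1 can be recomputed from its summary statistics. The function name and the choice of N rather than N - 1 in the denominator are assumptions made for illustration, not the authors' stated procedure.

```python
import math

def correlated_t(mean_a, mean_b, sd_a, sd_b, r, n):
    """t for the difference between two correlated (paired) means.

    Assumes the variance of the difference is
    sd_a^2 + sd_b^2 - 2*r*sd_a*sd_b; the article's exact formula
    is not given, so this is an illustration only.
    """
    var_diff = sd_a ** 2 + sd_b ** 2 - 2.0 * r * sd_a * sd_b
    se_diff = math.sqrt(var_diff / n)  # standard error of the mean difference
    return (mean_a - mean_b) / se_diff

# Intentional-task row of Table 1: novel vs. common, r = .77, N = 52.
t_intentional = correlated_t(14.75, 13.58, 3.85, 4.77, 0.77, 52)
print(round(t_intentional, 2))
```

Under these assumptions the result is about 2.77, close to the tabled 2.72; the small gap may reflect rounding of the published means and SDs or a slightly different denominator. The sketch also shows why the size of r matters: a larger positive correlation shrinks the standard error and raises t for the same mean difference.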

Although the intentional task scores 
covered a wide range under both con- 
ditions, they were nonetheless highly 
correlated. The substantial size of the 
correlation between the sets of scores 
(.77) is not surprising considering that 
all the completion tests were of com- 
parable difficulty and that the same 
intellectual ability is invested in the 
tests under both conditions. It is not 
an unexpected finding, therefore, that 
Ss who scored high on the completion 
tests associated with novel animals 
also scored high on the completion 
tests associated with common animals. 

A more interesting and unexpected 
finding, however, concerned the effect 
on intentional learning of curiosity 
arousal. A two-tailed t test shows a
significant improvement in intentional 
task learning under the novel animal 
condition as opposed to the common 
animal condition. 

The fact that the intentional task 
learning was significantly improved is 
a good indication that a state of 
arousal was operative. Improved central
task performance is a standard
criterion for increased motivation
(Easterbrook, 1959). The Ss were
better motivated to learn, although
the single fact of improved intentional
task performance alone does not indi- 
cate the identity of the motivation. 
The nature of the motivation is per- 
fectly clear, however, when the facili- 
tated intentional learning is consid- 
ered together with the results of the 
auxiliary study ranking the animals 
for interest and the auxiliary study 
demonstrating Ss' preference for look- 
ing at novel animals. All these findings 
demonstrate that Ss were motivated 
by curiosity, and constitute the opera- 
tional criteria in this experiment for 
curiosity arousal. 

The results on incidental learning 
clearly point to significant differences. 
The direction taken by the differences 
indicates that there is increased inci- 
dental learning with curiosity arousal. 


TABLE 1
COMPARISONS OF INCIDENTAL AND INTENTIONAL RECALL OF MATERIALS
PRESENTED WITH NOVEL AND COMMON ANIMALS

                                  Means and SDs
Variable                         Novel    Common       t      r(a)
Intentional task recall(b)**
  M                              14.75     13.58     2.72     .77
  SD                              3.85      4.77
Total incidental recall(c)**
  M                               4.21      3.25     4.05     .38
  SD                              1.07      1.37
Setting recall(d)**
  M                               2.81      2.37     2.75     .60
  SD                              1.27      1.33
Color recall(d)*
  M                               1.40       .88     2.36    -.07
  SD                              1.01       .73

Note. N = 52.

(a) Correlation between recall scores for the “novel”
vs. the “common” component of each variable.

(b) A score of 30.0 is perfect for each condition.

(c) A score of 10.0 is perfect for each condition. Chance
expectancy for each condition is 2.0.

(d) A score of 5.0 is perfect for each condition. Chance ...

The direction is maintained by both 
components of the total incidental 
learning score. 

Curiosity arousal facilitated the recall
of settings. The correlation coefficient
for the recall of settings when
paired with a novel animal versus re- 
call when paired with a common ani- 
mal was high, indicating that Ss who 
performed well under one condition 
tended to perform well under the 
other condition also. Thus, the co- 
efficient of .60 may be considered an 
alternate-form reliability coefficient 
for five items of the recall test. By 
the Spearman-Brown formula, the 
probable reliability of 10 items would 
be .75. 
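
The projection from five to ten items follows the general Spearman-Brown prophecy formula, r_k = k r / (1 + (k - 1) r), with k = 2 for a doubled test. A short sketch of that arithmetic is given here; the function and parameter names are illustrative only.

```python
def spearman_brown(r, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    r is the reliability of the original test; length_factor is the
    ratio of new length to old length (2 means the test is doubled).
    """
    return (length_factor * r) / (1.0 + (length_factor - 1.0) * r)

# Five-item reliability of .60, projected to ten items (factor of 2).
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```

With r = .60 and k = 2 this is 1.20 / 1.60 = .75, the value reported in the text.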

Differences in recall were main- 
tained by the color variable. Recall 
under the novel animal condition was 
significantly higher than the common 
animal condition, despite the fact that 
the two sets of scores showed virtually 
no correlation. Had the correlation 
been as high as the one existing between
the novel and common conditions
for the recall of settings, then a
comparably high value for t would
have been shown. 

Since both components of the total 
incidental learning score showed sig- 
nificant differences in the same direc- 
tion, it is to be expected that the total 
of these sets of scores would produce 
an even smaller probability value. 
Indeed, the t score for the differences
between the novel and common condi- 
tions produced a probability value far 
beyond the .01 level. The novel ani- 
mals thus appear to greatly facilitate 
total incidental learning. 

The results of the study offer clear
evidence of differential effect on incidental
learning by curiosity arousal.
Furthermore, curiosity arousal serves
to increase incidental learning whether
the cues are relevant or irrelevant to
the eliciting stimulus.


Discussion


The findings on the facilitation of 
intentional task learning were unex- 
pected—indeed, somewhat surprising 
—since the intentional task material 
was not spatially congruent with the
curiosity-eliciting stimulus. The pic- 
ture of the novel animal was on the 
left-hand page while the paragraph of 
text was on the right-hand page. 
Moreover, the picture and the text 
material were never seen simul- 
taneously. Despite the physical dis- 
tance involved, however, Ss seemed to 
be motivated to examine the written 
material about the strange animals 
with more enthusiasm than that in- 
spired by the familiar animals. 

With regard to learning theory, 
some who espouse the Hullian ap- 
proach (Myers & Miller, 1954) have 
felt that exploratory behavior, merely 
because it is externally elicited, should, 
like pain, be no differently treated 
than other drives. The present find- 
ings, however, indicate differences be- 
tween curiosity and aversive drives, 
such as pain, with regard to incidental 
learning that call for different predic- 
tive formulations. 

The effect on incidental learning of 
a variety of motivating conditions 
has been described by Easterbrook 
(1959) and again by Kausler and 
Trapp (1960). The results of the pres- 
ent study, however, suggest that the 
generalization proposed by Easter- 
brook concerning the relation between 
incidental learning and drive arousal 
may be meaningfully extended. 

The hypothesis presents itself that 
the effects of arousal on incidental 
learning are dependent upon the na- 
ture of the drive aroused. Although 


aversive or noxious stimulation tends 
to reduce the range of cue utilization, 
arousal conditions characterized by 
positive affective tone and interest 
may increase the range of cue utiliza- 
tion. Such a hypothetical generaliza- 
tion may be stated in Woodworth’s 
(1958) terms as follows: Arousal of 
need-primacy motives decreases inci- 
dental learning, whereas arousal of 
behavior-primacy motives facilitates 
incidental learning.


SELECTIVE FACILITATIVE EFFECTS OF INTERSPERSED QUESTIONS
ON LEARNING FROM WRITTEN MATERIALS


ERNST Z. ROTHKOPF AND ETHEL E. BISBICOS¹


Bell Telephone Laboratories, Incorporated 
Murray Hill, New Jersey 


Does the use of restricted categories of questions, incorporated in 
written instructional material, facilitate learning of restricted categories 
of text content? High school students (N = 252) saw a 36-page passage 
with 2 questions per 3-page zone. Questions differed in location (before 
or after the relevant segment) and in required response. Different 
treatments saw questions restricted to the following answer types: 
(a) either a quantitative term or name; (b) a common English or a 
technical word; (c) a mixture of (a) and (b). The experimental 
hypothesis was confirmed. Learning of several categories of text con- 
tent, as measured on a posttraining retention test, was facilitated by 
appropriate questions seen immediately after exposure to the relevant 
text segment. The results were interpreted to be a consequence of
modification of inspection behavior.


Several experimental investigations 
have shown that, under certain condi- 
tions, the incorporation of questions 
and other test-like features in written
instructional material increased the
amount learned from the text (e.g.,
Hershberger, 1964b; Keislar, 1960).
Rothkopf (1963, 1965, 1966), and 
Rothkopf and Coke (1963) have re- 
ported that the facilitating effects of 
questions have at least two compo- 
nents. These are: (a) direct instruc- 
tive effects, that is, questions are 
informative; and (b) general, atten- 
tion-like effects, that is, effects on the 
inspection (studying) behavior of the 
student. The second effect was ob- 
served to be a function of where 
questioning occurred in the stimulus 
materials. Periodic questions about 
material which had already been seen 
were found to result in more effective 
inspection (or mathemagenic)² behaviors
than periodic questions about
matter which subjects (Ss) were about
to study.

¹ We wish to thank our colleague Esther
Coke for her help in the computer analysis
of the data.

² The roots of this useful, but long word
are mathema: learning, that which is learned
and gignesthai: to be born. This seems an
appropriate reference to a class of responses
which give birth to learning (see Rothkopf,
1965).

The purpose of the present experi- 
ment was to find answers to the fol- 
lowing: (a) Does the use of restricted 
categories of questions in instructive 
text result in inspection behaviors 
which facilitate learning of restricted 
categories of text content? (b) Do facilitative
effects which are attributable
to inspection behaviors change as
a function of increased exposure to 
the experimental questions? 

A technique which had been devel- 
oped in an earlier study (Rothkopf, 
1966) was used to separate direct 
informative effects of questions from 
effects on inspection behavior. A num- 
ber of short-answer questions were 
composed for a lengthy factual passage.
These questions were divided
into two equal subsets, A and B. Subset
A was used as experimental questions
during study. Subset B was administered
after training to measure
how much was learned. However, the
two subsets of questions were selected
so that there was no observable direct
transfer of training from mastery of
the subject matter on which Subset
A was based to the subject matter
from which Subset B was derived. It 
was assumed that experimental verifi- 
cation of this relationship between A 
and B implied that any changes in 
Subset B performance, which result 
from exposure to the questions of Sub- 
set A, were not due to the informative 
aspects of questions of Subset A. 


METHOD


Materials 


Two chapters from Rachel Carson's book, 
The Sea Around Us, were used. The ex- 
perimental passage was approximately 9000 
words long. It dealt with animals and min- 
erals found in the ocean. The material, al- 
though topically related, was composed of 
a series of relatively independent factual 
segments.

The experimental passage consisted of 
36 typewritten pages which were divided 
into 12 three-page zones. Eight short-answer
questions were composed for the material 
in each three-page zone. The eight questions 
for each zone could be subdivided into four 
categories of two questions each according 
to the character of the correct answer. The 
categories were: (a) common phrases (C), 
any common nontechnical English words, 
for example, red or sense of touch; (b) 
technical phrases (T), scientific or technical 
words, for example, bathyscaphe or photo- 
trophic, all of these were of either Greek 
or Latin origins and could be broken into 
meaningful component roots; (c) measures 
(M), quantities of size, numerosity, dis- 
tance, as well as dates, for example, 2 ton 
or 1942; and (d) names (N), any proper
geographical or personal names or any 
other name-like word which had been as- 
signed more or less by fiat, for example, 
Johnson, Philippines, or Jesuit. A sample 
question for each of the above categories is 
given below.


³ Permission for the experimental use of these copyrighted materials
was kindly granted by the publishers, Oxford University Press,
417 Fifth Avenue, New York 16, New York.

⁴ The correct answer, and its classification according to experimental
type, follows each sample question in parentheses. (a) The small
whale which is popularly called a porpoise in America is really the
______ dolphin (bottle-nosed; common phrase). (b) The decaying
organic matter on the bottom of the stagnant Norwegian fiords
produces the chemical called ______ which poisons the deeper waters
(hydrogen sulphide; technical phrase). (c) The concentration of
bromine in the Dead Sea is ______ times as great as it is in the
ocean (100; measure). (d) The building of the bathyscaphe was first
proposed by the Swiss physicist, Professor Auguste ______ (Piccard;
proper name).


Four questions from each zone, one from every answer category, were
used to make up the 48-item criterion test. This test was
administered after the completion of training and was used for the
main experimental comparisons. The remaining 48 questions (i.e.,
four questions per zone) were used as the experimental questions
(EQs) which were located in various portions of the stimulus
material.


Experimental Treatments


The seven experimental treatments are summarized in Table 1. The
NOEQ group saw no experimental questions. All of the remaining
treatments saw two experimental questions per three-page zone, but
differed in type and location of experimental questions. Two of the
treatments (SBNM and SANM) saw identical sets of experimental
questions all of which required measured quantities (M) or names (N)
for answers. For SBNM, however, the two experimental questions about
a given zone were presented immediately before the three-page zone
was to be read (SB = shortly before), while for SANM, experimental
questions were presented immediately after (SA = shortly after) the
zone from which they were taken. In a like manner, SBCT and SACT
were exposed to the identical set of experimental questions, all of
which required either a common (C) or a technical word (T) or phrase
for a correct answer. But in SBCT the experimental questions preceded
their zone, while for SACT they followed. Finally, SAMX (shortly
after-mixed) and SBMX (shortly before-mixed) received the same set of
experimental questions which was composed of a mixture of questions
of all four types, selected so that each of the four types of
experimental questions was used exactly three times in each half of
the experimental text (six zones). The experimental orders for SAMX
and SBMX were arranged so that each of the 48 questions in the
experimental question pool was used an equal number of times over
all Ss.


TABLE 1
EXPERIMENTAL DESIGN

             Number of                          Location of
             experimental     Type of           experimental questions
             questions per    experimental      relative to relevant
Treatment    three-page zone  questions used    three-page zone
SBCT               2          C, T              shortly before
SBNM               2          N, M              shortly before
SBMX               2          C, T, N, M        shortly before
SACT               2          C, T              shortly after
SANM               2          N, M              shortly after
SAMX               2          C, T, N, M        shortly after
NOEQ               0          -                 -
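
The balancing rule stated for the mixed treatments (SAMX and SBMX) can be sketched in code. The article does not say how the orders were actually built; the sketch below assumes one simple scheme: because the experimental pool appears to hold one question of each answer type per zone, a zone's two questions must be of two different types, and the six possible pairs of the four types, used once each across six zones, give every type exactly three uses per half. All names here are hypothetical.

```python
import itertools
import random
from collections import Counter

TYPES = ("C", "T", "N", "M")  # common, technical, name, measure
ZONES_PER_HALF = 6

def mixed_half(rng):
    """Assign two distinct answer types to each of six zones.

    Uses the six unordered pairs of the four types once each, so
    every type occurs exactly three times within the half.
    """
    pairs = list(itertools.combinations(TYPES, 2))  # exactly 6 pairs
    rng.shuffle(pairs)
    return pairs

def mixed_form(seed=0):
    """Return a 12-zone assignment made of two balanced halves."""
    rng = random.Random(seed)
    return mixed_half(rng) + mixed_half(rng)

def complement(form):
    """Form that uses the two types NOT chosen in each zone."""
    return [tuple(t for t in TYPES if t not in pair) for pair in form]

form = mixed_form(seed=1)
for half in (form[:ZONES_PER_HALF], form[ZONES_PER_HALF:]):
    counts = Counter(t for pair in half for t in pair)
    print(dict(counts))  # each type should appear three times
```

A form and its complement together use every question in the pool exactly once, which is one way (again an assumption, not the reported method) to satisfy the requirement that each of the 48 questions be used equally often over all Ss.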


Procedure 


The experimental procedure was identi- 
cal to that used in an earlier study and has 
been described in detail elsewhere (Roth- 
kopf, 1966). All experimental materials, 
including the directions for their use, were 
bound in a loose-leaf notebook which was 
handed to S as he entered the room in which 
the experiment was conducted. The Ss en- 
tered individually and worked at one of the 
10 tables which were widely spaced in a
21 × 60-foot room. Each table was partially
screened from the view of adjacent
Ss.

Every S worked through the materials 
at his own rate. They were instructed to 
record the time they started and finished 
each page from a direct-reading clock lo- 
cated at their work table. When S finished 
reading a page of text or an experimental 
question, he tore it from the notebook and 
dropped it through a slit into a ballot-box- 
like container. 

The criterion test was administered im- 
mediately after training. Four forms of the 
test were used. These differed only in the 
sequence of the 48 questions.


Subjects 


Paid volunteers (N = 252) from Grades 10, 11, and 12 of Governor
Livingston Regional High School of Berkeley Heights, New Jersey,
served as Ss immediately after the end of their regular school day.
Thirty-six were assigned in a nonsystematic manner to each of the
seven experimental treatments. A group of 23 additional Ss was used
to evaluate the direct instructive (transfer) effects of EQs on the
skills required for high criterion test performance.

⁵ We wish to express our sincere thanks to W. M. Davies,
Superintendent, Union County Regional High School District No. 1, as
well as to F. Aho, Principal, and J. S. Stein, both of Governor
Livingston High School, for their help in procuring student
volunteers for this study.


RESULTS 


The absence of transfer between 
subject matter skills inducing high 
performance on experimental questions
and those underlying performance
on criterion test questions was
evaluated by the following procedure.
A group of 23 Ss, from the same popu- 
lation as the experimental groups, 
practiced giving the correct answers to 
experimental questions by a modifica- 
tion of the anticipation method (for 
details see Rothkopf, 1966). Fol- 
lowing mastery of all questions, one 
of the forms of the criterion test was 
administered. The scores were com- 
pared with those of another form of 
the criterion test which had been ad- 
ministered prior to practice on the set 
of experimental questions. 

The mean of the observed pretrain- 
ing scores was 3.17 correct responses. 
The posttraining average was 3.04. 
The net change of —.13 offers no 
ground for rejecting the hypothesis 
that the skills which are required to 
answer the experimental questions do 
not transfer to performance on the 
criterion test. These findings indicate 
that any improvement in criterion test 
performance associated with exposure
to experimental questions was not due
to the direct informative effects of
these questions.

Questions on the criterion test were 
divided into four numerically equal 
categories according to a 2 x 2 table 
with the following classifications: 
Factor 1, correct answer type; (a)
answer either measures (M) or proper
names (N), or (b) answer either common
words (C) or technical phrases
(T). Factor 2, location in the experimental
text of the material on which
the question was based, that is, either:
(a) pp. 1-18, or (b) pp. 19-36. Mean
correct responses to criterion test 
questions of the two classes of answer 
types are plotted as function of loca- 
tion of text source, for the seven ex- 
perimental treatments, in Figure 1. 
These results may be summarized as 
follows: 

Treatments in which experimental 
questions were administered after ex- 
posure to the relevant text segment 
resulted in better overall retention test 
performances than the corresponding 
treatments in which experimental 
questions were administered just before
inspection of the relevant text segment
(SANM > SBNM, SACT >
SBCT, SAMX > SBMX). This effect 
is weak for SACT. The administration 
of experimental questions after expo- 
sure to the relevant text segment pro- 
duced higher retention test perform- 
ance than the NOEQ condition. But 
experimental questions just prior to 
inspection of the appropriate text seg- 
ment resulted in performance which 
was not distinguishable from this con- 
trol (NOEQ) condition. 

These conclusions were supported 
by the Type VI analysis of variance 
(see Lindquist, 1953, pp. 292-297). 
Treatments F was significant (F = 
3.09, p < .01, df = 6, 245). Comparison
of overall treatment means by t
test yielded the following results:
SAMX:SBMX, t = 2.08, p < .05;
SACT:SBCT, t = .65, p > .50;
SANM:SBNM, t = 3.03, p < .01. The
t comparisons between each of the
three SA treatments and the NOEQ
group were all significant beyond the
.05 level.

The effect of the SA treatments 
tends to be more marked for materials 
from the second half of the experimental
text than for the first. The facilitative
effects which result from questions
apparently grow with increased
exposure to the experimental questions
at least for some items. This conclusion
was supported by the significant
interaction between Treatment × Location
of criterion test question source
in the text (F = 2.70, p < .05, df =
6, 245). This finding is consistent with
the interpretation that EQs in the SA
treatments tend to shape increasingly
effective inspection behaviors.

[Figure 1 not reproduced. Panels: M and N questions; C and T
questions. Vertical axis: mean correct responses. Horizontal axis:
page number of question source (1-18, 19-36).]

FIG. 1. Mean correct responses for criterion
test questions of the two answer
categories as function of location of the
question source in the text.


Selective Facilitation Effects 


Selective effects of experimental 
questions can be inferred where fa- 
cilitation is more pronounced on cri- 
terion test items with answer types 
that have been represented in the set 
of experimental questions used in 
training. The results tend to support 
the selective facilitation hypothesis. 
This conclusion was reached in the 
following way: 

Both the names:measures and common:technical
phrases data were ordered
in a manner consistent with the
selective facilitation hypothesis although
the effect was small and unreliable
for common and technical
phrase items.

The SAMX and SANM treatments,
both of which were exposed to meas- 
ures and name EQs during training, 
resulted in the largest average number 
of correct responses on measures and 
name items. The effect is much more 
pronounced for items based on the 
second half of the experimental text. 
Comparisons of average second-half 
measure and name scores with the 
control groups yielded t's of 1.98 for
SANM (p < .05) and of 1.75 for
SAMX (p < .10).

Treatments SAMX and SACT, 
which had common and technical
phrase items included in their exper- 
imental questions, resulted in a 
greater number of correct responses 
in the common and technical phrase 
category than the other treatments. 
This ordering was mainly due to tech- 
nical phrase items. But the differences 
were so small that they were very 
likely due to chance.


[Figure 2 not reproduced. Curves: SA, SB, NOEQ. Vertical axis: mean
inspection time (secs/100 words). Horizontal axis: three-page zones
(1-12).]

FIG. 2. Inspection time in average number
of seconds per 100 words of text for
the 12 three-page zones. (For the purpose
of graphic simplicity, the data for the three
SA treatments was pooled. So was the data
for the three SB conditions.)


Inspection Time 


Analysis of variance indicates no 
reliable differences in mean inspection 
time per page among the various 
treatments (F = .80, df = 6, 245). As
can be seen in Figure 2, inspection 
time per page drops markedly as a 
function of the number of pages read. 
Average inspection time per page was 
significantly shorter for pages 19-36 
than pages 1-18 (F = 147.89, p < 
.001, df = 1, 6). This finding is in 
keeping with previous observations 
(Rothkopf, 1966), and suggests an 
extinction-like process which can pro- 
gressively weaken inspection behavior 
during the course of study. 
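
The inspection-time measure plotted in Figure 2 (seconds per 100 words) can be derived from the start and finish times that Ss recorded for each page. The following is a minimal sketch of that conversion, pooled over the pages of a three-page zone; the clock readings and word counts are invented for illustration and are not data from the study.

```python
def secs_per_100_words(start_secs, finish_secs, word_count):
    """Inspection rate for one page, in seconds per 100 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * (finish_secs - start_secs) / word_count

def zone_rate(pages):
    """Pool the pages of one zone: total seconds over total words.

    pages is a list of (start_secs, finish_secs, word_count) tuples.
    """
    total_secs = sum(finish - start for start, finish, _ in pages)
    total_words = sum(words for _, _, words in pages)
    return 100.0 * total_secs / total_words

# Hypothetical clock readings (seconds) and word counts for one zone.
zone = [(0, 95, 250), (95, 185, 250), (185, 270, 250)]
print(zone_rate(zone))  # 36.0
```

Pooling seconds and words before dividing weights each page by its length; averaging the per-page rates instead would be an equally defensible choice when pages are of similar length.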


Discussion 


This experiment provides further 
evidence that facilitative effects can 
be produced by questions that are 
presented immediately after inspect- 
ing the appropriate text segment (SA 
questions). Questions presented before 
inspection of the source segment (SB 
questions) were not found to produce 
these results. 

For reasons stated in the introduc- 
tion, the facilitative effects which were 
observed in this experiment have been 
identified with a class of activities 
called inspection behaviors or mathe- 
magenic behaviors (Rothkopf, 1963, 
1965). The results provide support 
for the hypothesis that the use of SA- 
type questions can modify mathema- 
genic behaviors so as to facilitate 
learning of specific subsets of the writ- 
ten experimental materials. A study 
by Hershberger (1964a) was equivo- 
cal about selective facilitative effects 
of questions. 

The present experiment is not de- 
cisive as to the exact mechanism by 
which mathemagenic behaviors are 
modified by SA questions. The most 
plausible conception is a kind of adaptive evolution of
mathemagenic behaviors, with questions providing selective
contingencies. Mathemagenic behaviors, according to this view, are
extinguished and dropped if they do not result in learning the
skills necessary to answer the experimental questions. On the other
hand, mathemagenic behaviors which preceded successful performance
on experimental questions would be strengthened. As a consequence,
the SA treatments would be more likely to result in the acquisition
of adaptive (i.e., successful) mathemagenic responses than the SB
or the NOEQ conditions. This conception would also explain why the
effect of the SA treatments tends to be more pronounced for the
second half of the experimental text, at least for some items.
Increased exposure to experimental questions in the SA treatments
increases the likelihood that successful mathemagenic behaviors
have evolved.

It should be noted that the two categories of questions used in
this experiment (i.e., names and measures versus common and
technical phrases) are not equivalent. Questions requiring names,
measures, or technical phrases for responses can be derived from
only a limited number of sentences in the experimental passage
while a question requiring common words as responses can be formed
from nearly every sentence. The notion of selective facilitation
presupposes the possibility of specifically adaptive mathemagenic
maneuvers and this is quite difficult to conceive for the common
word-type item. Consequently, it was not unexpected that
facilitation effects for questions requiring common or technical
phrase answers were found to be weak and were almost entirely due
to the technical phrase questions.

The present experiment indicates 
that test-like events such as questions 
can facilitate learning. However, it
must not be overlooked that Rothkopf
and Coke (1963, 1966) have
shown that, under certain conditions, 
test-like events may shape mathema- 
genic behaviors which are contrary to 
instructional objectives. It is therefore 
important to avoid the conclusion that 
the effect of test-like events is always 
educationally desirable.


VARIATION IN THE AMOUNT OF IRRELEVANT CUES IN TRAINING AND
TEST CONDITIONS AND THE EFFECT UPON TRANSFER¹


ROBERT L. R. OVERING 
McGill University 


AND 


ROBERT M. W. TRAVERS 
Western Michigan University 


The principle of refraction was taught to 96 Ss (52 males, 44 females) 
under 1 of 2 conditions: (a) where a considerable amount of irrelevant 
visual information accompanied the instruction (TC+), and (b) where 
as much irrelevant visual information as possible had been eliminated 
(TC−). ½ of each group was then tested for transfer under 1 of 2 test
conditions: (a) in the presence of considerable irrelevant information;
(b) in the absence of all possible irrelevant information. Ss trained
under the TC+ condition showed no difference between test conditions.
However, Ss trained in the TC− condition showed significantly better
performance on the test in which irrelevant information had been 
reduced than on the test possessing irrelevant information. 


An experiment previously reported 
by Overing and Travers (1966), building
on the work of Judd (1908) and
Hendrickson and Schroeder (1941), 
indicated that the manner in which 
a principle is taught affects the 
amount of transfer in a task requiring 
application of that principle. The task 
involved the application of the princi- 
ple of refraction to the understanding 
of the bending of a beam of light at 
the surface of water. The results sug- 
gested that a “realistic” treatment 
with all its irrelevant cues was signifi- 
cantly better in facilitating transfer 
than was a “visually compressed” 
treatment which had many of its ir- 
relevant cues removed. The designa- 
tion of cues as relevant or irrelevant 
is a somewhat arbitrary matter. The 
criterion used here was subjective and, 
hopefully, logical. Those elements 
deemed to be essential (e.g., light 
beam and water) were left untouched, 
while those elements deemed to be irrelevant 
(e.g., shape of tank, ancillary 
equipment) were eliminated or obscured. 


*This article is based on research sponsored 
by the New Educational Media 
Branch, United States Office of Education, 
under Project C-997. Complete details of 
the procedures described can be found in a 
doctoral dissertation by R. L. R. Overing 
which includes this and other studies. 


In the previous experiment (here- 
after referred to as Experiment I), the 
realistic treatment consisted of an 
actual demonstration of (a) the re- 
fraction of light rays entering and 
leaving water, and (b) the apparent 
displacement of an underwater object, 
when viewed at an angle from above. 
For the present demonstration a three- 
dimensional physical apparatus was 
used, and the verbal components of 
the instruction were transmitted by 
prerecorded tape. For the visually 
compressed treatment the same tape 
was used, thus ensuring that the verbal 
information was constant for each 
treatment, but the visual components 
of the demonstration of refraction and 
displacement were taught using life- 
sized, four-color line-drawings, de- 
signed to exclude all information or 
cues judged to be irrelevant. 

One interpretation of the results of 
the previous experiment is that train- 
ing with compressed information fails 
to give the learner experience in deal- 
ing with situations involving many 
irrelevant cues. This interpretation 
suggests that the person who learns to 
understand a principle in the absence 
of irrelevant cues has more difficulty 
in applying the principle to a situa- 
tion involving many irrelevant cues 
than he would have had if the original 
learning had been undertaken in the 
presence of irrelevant cues. The sug- 
gestion is that the advantage of learn- 
ing under realistic conditions is that 
such conditions generally involve 
many irrelevant cues. Simplified situa- 
tions, such as are presented by dia- 
grams, do not involve irrelevant cues 
and, hence, may provide inefficient 
teaching situations. The study which 
follows represents an attempt to in- 
vestigate further the extent to which 
learning either with or without irrele- 
vant cues facilitates transfer to new 
situations which also may be either 
with or without irrelevant cues. 


METHOD 


Design and Materials 


The experiment involved two 1 X 2 X 2 
X 2 X 12 factorial designs, each comprising 
one training condition, two sexes, two test 
conditions, two tests within each test con- 
dition and twelve repeated measures within 
each of the two tests. 
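
The factorial layout described above can be made concrete by listing its cells. The following is only an illustrative sketch: the level labels and the names `FACTORS` and `design_cells` are assumptions introduced here, and only the number of levels per factor is taken from the text.

```python
from itertools import product

# Factor levels for ONE of the two designs. Only the counts (1, 2, 2, 2, 12)
# come from the text; the labels themselves are illustrative assumptions.
FACTORS = {
    "training": ["TC+"],                    # one training condition per design
    "sex": ["male", "female"],
    "test_condition": ["Test+", "Test-"],
    "test": ["Test 1", "Test 2"],
    "shot": list(range(1, 13)),             # twelve repeated measures per test
}

def design_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

cells = design_cells(FACTORS)
print(len(cells))  # 1 * 2 * 2 * 2 * 12 = 96 cells in each design
```

Replacing "TC+" with "TC-" gives the cells of the second design. Sex and test condition vary between Ss, whereas test and shot are repeated within each S.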


Training Conditions 


Of the two training conditions, one was 
identical to the realistic condition of the 
earlier experiment (i.e., the demonstration 
of refraction and displacement was conducted 
with actual physical apparatus and 
no effort was made to control for the presence 
of irrelevant cues) and for purposes of 
this experiment is designated here as Training 
Condition Positive (TC+). The second 
training condition, designated Training 
Condition Negative (TC−), used identical 
apparatus to TC+ except that every effort 
was made to remove or minimize irrelevant 
cues. To this end large black screens were 
built to screen off much of the apparatus 
and to provide a neutral background. Sections 
of the apparatus were coated with 
flat black paint to eliminate reflections and 
to minimize edges and surfaces of the apparatus. 
All ancillary apparatus, such as the 
light source, water pail, and dust-making 
apparatus (used to make the light beam show up 
in air), was partially or completely 
screened from the Ss. The S viewing the 
demonstration in this training condition 
could see little except the light beam passing 
through the air and water. 

In both the negative and positive train- 
ing conditions, the identical sound track 
transmitted the verbal component of the 
instruction, this sound track being the same 
one which had been used for the realistic 
treatment of Experiment I. 


Testing Conditions 


After training, Ss were exposed first to 
one of two test conditions. This first test, 
regardless of condition, is referred to as 
Test 1. They were then exposed to one of 
two conditions in a second test, referred to 
as Test 2. Both Tests 1 and 2 involved the 
problem of aiming a pistol correctly at an 
underwater target which, because of the 
refraction of light, did not really lie where 
it appeared to lie. 
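
The size of the aiming error that the test task invites can be estimated from Snell's law. The sketch below is not taken from the article: the refractive index of 1.33, the function name `aim_error`, and the assumption that a bullet would continue in a straight line along the line of sight are all introduced here for illustration, and the depths and angle in the usage lines are hypothetical.

```python
import math

WATER_INDEX = 1.33  # approximate refractive index of water (assumed value)

def aim_error(depth, sight_angle_deg, n=WATER_INDEX):
    """Horizontal miss distance on the tank bottom when aiming along the line of sight.

    depth           -- water depth over the target (any length unit)
    sight_angle_deg -- angle between the line of sight and the vertical, in air
    Light from the target leaves the water at the refracted angle, so a straight
    line of sight extended below the surface lands beyond the real target.
    """
    theta_air = math.radians(sight_angle_deg)
    theta_water = math.asin(math.sin(theta_air) / n)   # Snell's law
    return depth * (math.tan(theta_air) - math.tan(theta_water))

# Hypothetical depths and viewing angle, for illustration only.
print(aim_error(depth=5.0, sight_angle_deg=45))
print(aim_error(depth=8.0, sight_angle_deg=45))
```

Under these assumptions the miss distance grows with both water depth and viewing angle, which is consistent with the use of deeper water to make a test harder.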

Of the two test conditions, the one desig- 
nated Test Condition Positive (Test +) 
was very similar to the test conditions of 
the original experiment. It was, however, 
somewhat embellished in that realistic pis- 
tols were substituted for the former plain 
wooden guns, new target objects suggestive 
of real organisms (a fish and a tadpole) 
shown in Figure 1 replaced the former sim- 
ple cross and circle, and the numbers on 
the plastic grid (used to permit Ss to call 
off their point of aim) were represented 
both in red and blue, as opposed to being 
merely blue in the original experiment. 
Thus a number of new irrelevant conditions 
were introduced into this training condition. 

In the other test condition, designated as 
Test Condition Negative (Test−), Ss were 
exposed to several modifications designed 
to eliminate or minimize all possible irrelevant 
cues. First, the tests of the negative 
condition were conducted in darkened 
rooms, so that as little as possible of the 
general surroundings was visible; second, 
the size and shape of the test tanks were 
obscured by covering the tanks with black 
cloth, in which was cut an aperture just 
large enough to permit the numbered portion 
of the grid to show. The target pistols 
were the simple wooden pistols of the first 
experiment, but were now covered with flat 
black paint. The target objects, as before, 
were the simple cross and circle shown in 
Figure 2. The only source of light was from 
a lamp placed under the black cloth so as to 
shine directly into the tank. Some light, 
naturally, escaped through the grid aperture 
and dimly illuminated the surroundings. 


Fig. 1. Targets used in Tests 1 and 2 of 
Test Condition Positive as they appeared 
on the mimeographed sheets handed to S 
and as they appeared on the bottom of the 
tank. [Figure: panels "Test I, Positive" and 
"Test II, Positive"; labels "HIT" and 
"BULLSEYE"; heading "THIS IS WHAT YOUR 
TARGET WILL LOOK LIKE."] 


Fig. 2. Targets used in Tests 1 and 2 of 
Test Condition Negative as they appeared 
on the mimeographed sheets handed to S 
and as they appeared on the bottom of the 
tank. [Figure: panels "Test I, Negative" and 
"Test II, Negative"; labels "HIT" and 
"BULLSEYE"; heading "THIS IS WHAT YOUR 
TARGET WILL LOOK LIKE."] 

As in the previous experiment, each test 
condition comprised two tests, designated 
Test 1 and Test 2. However, the difficulty 
of the two tests in this second experiment 
was increased by changing the water depth 
in the tanks and altering the angle of in- 
cidence of the gun to the water to increase 
the visual displacement between the actual 
target and the apparent target. The main 
purpose of this increased displacement was 
to see if increasing the difficulty 
of Test 1 would modify the Test 1 results. 
In Experiment I, reported earlier, no dif- 
ferences among training conditions on Test 
1 were found and it was hypothesized that 
this might have been due to the first test 
having been too easy, thus resulting in 
scores which approximated a chance dis- 
tribution. 


Procedure 


The Ss (N = 96) were sixth-grade pupils 
attending two elementary schools in a 
middle-class section of Salt Lake City. The 
group comprised 52 boys and 44 girls. As in 
the first experiment, Ss were preselected in 
the sense that those with IQ's below 100 or 
over 130 on the Pintner Intermediate Test 
were excluded. Pupils were randomly as- 
signed to each experimental condition. 

The procedure followed was very similar 
to that of the first experiment. Each of the 
four groups, each group comprising 13 males 
and 11 females, reported to the teaching 
station where they were exposed either to 
the negative training condition or to the 
positive training condition. Upon comple- 
tion of the teaching program the group was 
then divided and one-half of the Ss re- 
ported to the Negative Test Station and 
the other half reported to the Positive Test 
Station. Ss entered the Test 1 area of their 
respective stations individually, and were 
given a printed sheet of directions which 
read as follows. 

You have just learned that under- 
water objects, when viewed at an angle 
from above, do not really lie in the 
exact position that they appear to lie. 

Remembering this fact, your prob- 
lem, now, is to aim the pistol so that a 
bullet from it would strike the under- 
water target. The depth of the water 
over this particular target is 594 inches. 

The plastic grid over the target has 
numbered lines marked on it. These 
lines will enable you to report your 
point of aim to the attendant. 

When you have reported your point 
of aim, the examiner will say one of the 
following: “HIT,” “MISS,” or “BULLS- 
EYE.” 

“HIT” means that you have hit some 
part of the target, either above or be- 
low the center of it. 

“MISS” means that you have not 
struck any part of the target. 

“BULLSEYE” means that you have 
struck the very center of the target. 


Your object is to score a bullseye in as 
few shots as possible. You are to go 
on shooting and reporting your shots 
as quickly as possible until the attendant 
tells you to stop. 

Any questions? 


TABLE 1 

MEAN IQ OF ALL SUBJECTS BY TREATMENT 

Treatment                           Mean IQ     SD 
Positive training, positive test    112.58     7.89 
Positive training, negative test    115.12     8.32 
Negative training, positive test    115.92    10.23 
Negative training, negative test    114.21     9.49 

Note. N = 13 males, 11 females for each treatment. 
Total N = 52 males, 44 females. 


The instruction sheets to Ss at the negative 
and positive stations were identical 
except that in each case the appropriate 
target was drawn on them. The Ss were invited 
to question the attendant about the 
instructions if they were in doubt on any 
point. The Ss then aimed the gun so as to 
attempt to hit the underwater target, indicating 
their point of aim by reference to 
the numbered plastic grid, and the attendant 
provided feedback by saying “Hit,” “Miss,” 
or “Bullseye,” whichever was appropriate, 
after each shot. The Ss, as before, were permitted 
12 trials in which to achieve criterion. 
If they did so before the twelfth 
trial, they proceeded to the Test 2 cubicle 
where they were presented with the following 
instructions: 

Your task here is similar to the one 
in the other room. You are to aim the 
pistol so that a bullet from it would 
strike the underwater target. The depth 
of the water over this target is 8½ 
inches, that is, it is deeper than in the 
previous task. As before, you will be 
told “HIT,” “MISS,” or “BULLSEYE.” 

Any questions? 

As with Test 1, the instructions for negative 
and positive conditions were identical 
except for the appropriate target representations. 


Fig. 3. Learning curves for Test 1, Training 
Condition Positive. [Figure: horizontal axis, shot number.] 


Fig. 4. Learning curves for Test 1, Training 
Condition Negative. [Figure: horizontal axis, shot number.] 


RESULTS 


As in Experiment I, each S's scores 
on Tests 1 and 2 were converted to 
error scores by subtracting the value 
of the criterion aiming point from the 
value of the point of aim for each shot. 
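
The conversion to error scores can be sketched in a few lines, together with the restriction to the first seven shots that the analyses of variance in this experiment use. This is a minimal illustration rather than the authors' scoring procedure: the function names, the grid readings, and the criterion value of 10 are hypothetical, and the mean absolute error is an assumed summary statistic that the article does not specify.

```python
from statistics import mean

ANALYZED_SHOTS = 7  # the analyses of variance used the first seven shots only

def error_scores(aim_points, criterion):
    """Signed error for each shot: point of aim minus criterion aiming point."""
    return [aim - criterion for aim in aim_points]

def analyzed_errors(aim_points, criterion, n_shots=ANALYZED_SHOTS):
    """Error scores restricted to the first n_shots shots."""
    return error_scores(aim_points, criterion)[:n_shots]

def mean_absolute_error(errors):
    """Assumed summary: mean size of the error, ignoring its direction."""
    return mean(abs(e) for e in errors)

# Hypothetical grid readings for one S who never reached criterion in 12 shots.
shots = [16, 14, 15, 13, 14, 12, 13, 15, 11, 16, 12, 14]
errors = analyzed_errors(shots, criterion=10)
print(errors)
print(mean_absolute_error(errors))
```

Truncating at seven shots keeps the late, erratic shots of the few Ss who never reached criterion from dominating the group means.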

Figure 3 shows the learning curves 
which resulted from Test 1, Training 
Condition Positive, and Figure 4 shows 
the Test 1 learning curves resulting 
from Training Condition Negative. 
Figures 5 and 6 show the Test 2 
learning curves for Training Condi- 
tions Positive and Negative, respec- 
tively. 

Examinations of the individual 
learning curves indicated that, as in 
Experiment I, a majority of Ss had 
achieved criterion by Shot No. 7, and 
that the learning curves from Shot 
No. 7 on tended to be weighted heav- 
ily by the scores of one or two Ss who 
had shown little or no learning, and 
who in some cases, by this stage, were 
resorting to wild guessing. On this 
basis the same arbitrary decision was 
made as was made in the earlier experiment, 
to run the analyses of variance 
on the scores of the first seven 
shots only. 

The means and standard deviations 
on which subsequent analyses are 
based are shown in Table 2. Analyses 
of variance of the Test 1 error scores 
showed no differences between test 
conditions positive or negative for 
either of the training conditions. A 
significant difference between male 
and female did, however, occur for 
both Training Condition Positive (F 
= 16.20, p < .001) and Training Condition 
Negative (F = 7.35, p < .01). 

The analyses of variance of the Test 
2 error scores, as was the case with the 
Test 1 scores, showed no differences 
between positive and negative testing 
conditions when learning had occurred 
under Training Condition Positive. A 
male-female difference was also evi- 
dent, p < .01 (F = 8.45, df = 1 and 
44) with this positive training condi- 
tion. The results of the analysis of the 
Test 2 scores from the negative, or 
visually compressed, training condi- 
tion are in sharp contrast, however. 
Here, a difference, significant at the 
.06 level (F = 3.85, df = 1 and 44), 
did occur between positive and negative 
testing conditions, and no male-female 
differences were evident. 


Fig. 5. Learning curves for Test 2, Training 
Condition Positive. 


Fig. 6. Learning curves for Test 2, Training 
Condition Negative. 


In order to have a basis for comparing 
the results of Experiment I with 
those of Experiment II a t test was 
run between Training Condition Positive, 
Test Positive and Training Condition 
Negative, Test Positive, which, 
in effect, could be considered to be the 
training-testing combinations replicating 
the realistic and visually compressed 
conditions of Experiment I. Although 
the difference between the means was 
in the anticipated direction (that is, 
TC+, Test+ better than TC−, Test+), 
the results were not significant, 
achieving a p of slightly less than .10 
on a one-tailed test. 


DISCUSSION 


Examination of the learning curves 
for Test 1 shows that there were no 
differences in performance between 
those Ss trained under the condition 
with irrelevant information (TC+) 
and those Ss trained under the condition 
with reduced irrelevant information 
(TC−). This held true regardless 
of whether Ss were tested under the 
test condition with irrelevant information 
(Test+) or the test condition 
with reduced irrelevant information 
(Test−). This finding confirms the 
results of Experiment I where no 
differences between conditions were 
observed on Test 1. 


TABLE 2 
MEANS AND STANDARD DEVIATIONS BY LEARNING CONDITION, TESTING CONDITION, 
SEX, AND TEST 1 AND TEST 2 

                                                              Male           Female 
Test     Test condition                     Training        M     SD       M     SD 
Test 1   Irrelevant cues added (Test+)      Positive       1.54  1.32     2.87  1.87 
Test 1   Irrelevant cues added (Test+)      Negative       2.03  1.46     2.93  1.57 
Test 1   Irrelevant cues reduced (Test−)    Positive       1.57  1.16     3.89  1.88 
Test 1   Irrelevant cues reduced (Test−)    Negative       2.17  1.23     3.51  1.45 
Test 2   Irrelevant cues added (Test+)      Positive       3.55  2.48     4.19  2.42 
Test 2   Irrelevant cues added (Test+)      Negative       4.59  3.08     5.40  3.53 
Test 2   Irrelevant cues reduced (Test−)    Positive       1.71  2.85     5.73  3.69 
Test 2   Irrelevant cues reduced (Test−)    Negative       2.84  2.71     3.80  2.43 


However, as was also true in Ex- 
periment I, differences between cer- 
tain of the training-testing combina- 
tions of Experiment II emerge in Test 
2. These Test 2 results would seem to 
suggest the following conclusions: Ss 
trained under conditions possessing ir- 
relevant information seem able to 
transfer their learning to test situa- 
tions possessing irrelevant informa- 
tion, or alternatively, to test situations 
with reduced irrelevant information, 
equally well. However, Ss trained 
under conditions of visual compression, 
that is, with reduced irrelevant 
information, are less able to transfer 
their learning to test situations pos- 
sessing irrelevant information than 
they are to transfer learning to test 
situations with reduced irrelevant 
information. 

Although the results of this experiment 
do not appear to confirm the 
findings of Experiment I, namely, that 
a TC+, Test+ combination is significantly 
better than a TC−, Test+ 
combination, the reader should note 
that both the Training Condition Negative 
and the Test Condition Positive 
of Experiment II were different from 
Experiment I. However, the significant 
difference between the TC−, Test− 
combination and the TC−, Test+ 
combination does appear to add support 
to the original supposition drawn from 
Experiment I that when irrelevant 
cues are encountered in the testing 
situation which were not present during 
training, the level of performance 
will be lower. 

The male-female differences evident 
in the Test 1 results of both positive 
and negative training conditions, as 
well as in the Test 2 results of the 
positive training condition, but ab- 
sent in the negative training condi- 
tion, Test 2, require some comment. 
There is some evidence (Reed, 1961) 
that adolescent boys show more of the 
socially expected male interest in sci- 
ence and that they also report signifi- 
cantly more scientific activities than 
do girls. The subject matter taught in 
this present experiment is of a scien- 
tific nature and in addition the partic- 
ular transfer task used appears to be 
by its nature a “masculine” task 
comprising activities and apparatus more 
familiar to boys than to girls. Girls, 
accordingly, could be presumed to be 
at a disadvantage which was reflected 
in generally poorer performance. Why 
this difference between male and fe- 
male performance should disappear in 
Test 2 of Training Condition Negative 
is a phenomenon for which we have 
no explanation. Speculation might sug- 
gest that confronted with a learning 
task that was relatively unfamiliar 
and of low interest, females were able 
to learn more from the less distractive 
training condition in which irrelevant 
information had been reduced, and 
thus were able to perform comparably 
to males. What is of interest and sug- 
gestive of further research is the fact 
that having seen this phenomenon in 
Experiment II the authors returned 
to the data of Experiment I and found 
a somewhat similar result. That is, 
in Test 2 of Experiment I, females 
trained under the condition in which 
irrelevant information had been reduced 
performed comparably to males. 


REPLACEMENT AND NONREPLACEMENT STRATEGIES 
IN CHILDREN’S PROBLEM SOLVING 


M. C. WITTROCK 
University of California, Los Angeles 


The replacement and nonreplacement strategies were operationally defined 
in problems involving 4 concepts. The 119 2nd graders from 2 
public schools were individually assigned at random to the 3 treatments: 
Replacement, Nonreplacement, and Control. From a “hierarchical” 
model of verbally mediated transfer the following prediction 
was made. On tests of learning, near and remote transfer, from high to 
low the treatment means would rank: Nonreplacement, Replacement, 
and Control. In each of the 4 tests of learning and transfer, the treatment 
effect was statistically significant (p < .01) by analysis of variance. 
The predicted rank order among the means occurred in each school in 
each test. Newman-Keuls multiple comparisons tests indicated close 
agreement to the hypothesis. Young children learned a difficult strategy. 
They transferred it to new problems, including problems involving 
only unrehearsed concepts. 


Experiments on the teaching of 
problem solving to children are rare 
in educational psychology. But these 
experiments can help educational 
psychologists to discover knowledge and 
theory about what is possible in teach- 
ing and in transfer of training. 

The experiment reported below in- 
volves two types of problems. First, 
the dependent variable of most inter- 
est, transfer to new concepts, is often 
impervious to instruction, especially 
with young children. Second, an effi- 
cient nonreplacement strategy is diffi- 
cult for young children to learn. 

In two experiments on verbal cues 
(Wittrock & Keislar, 1965; Wittrock, 
Keislar, & Stern, 1964) children were 
taught to transfer to new examples of 
the same concepts learned during 
training. In the instructional treat- 
ment best for producing transfer 
within the previously learned con- 
cepts, the child was cued with the 
correct concept name. Neither the gen- 
eral basis for matching, nor the spe- 
cific, correct answer to each item, 
which were the cues used in two other 
treatments, was as effective as the 
correct concept name. But none of the 
treatments produced much transfer to 
new concepts. It was concluded that 
something more than these verbal 
cues, probably a verbal sequence or a 
strategy,¹ is needed to produce transfer 
to new concepts. A strategy is general 
and should transfer broadly. 

Anderson (1966) taught first grad- 
ers to use a strategy to solve problems. 
However, he used as subjects (Ss) 
only the top third of the IQ distribu- 
tions of the classes in the study. 

From information theory, both 
the replacement and nonreplacement 
strategies used below should be more 
efficient than the “try one at a time” 
strategy Anderson used. In the study 
below, the child is taught to gain one 
bit of information on each succeeding 
trial. Until he solves the problem, on 
each trial he can eliminate half of the 
remaining choices. 
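
The arithmetic behind "one bit of information on each succeeding trial" can be checked directly. The sketch below assumes only that every trial halves the set of remaining choices; the function names are introduced here for illustration.

```python
import math

def trials_needed(n_choices):
    """Trials needed to isolate one choice when each trial halves the set."""
    return math.ceil(math.log2(n_choices))

def halving_history(n_choices):
    """Number of choices still open after each trial, starting from n_choices."""
    remaining = n_choices
    history = [remaining]
    while remaining > 1:
        remaining = math.ceil(remaining / 2)  # one bit gained: half are ruled out
        history.append(remaining)
    return history

print(trials_needed(4))    # four concepts -> log2(4) = 2 trials
print(halving_history(4))  # [4, 2, 1]
```

With the four concepts used in this study, two informative trials are enough to identify the correct basis for matching.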

When they are solving problems, 
children and adults often sample with 
replacement. That is, within a problem, 
they sometimes repeat incorrect 
answers and hypotheses. In their theoretical 
models of concept learning and 
problem solving, Restle (1962) and 
Atkinson, Bower, and Crothers (1966) 
assume that people often sample with 
replacement. 


¹ A strategy is operationally defined as a 
discriminable set of responses to a series of 
stimuli. 


A more efficient procedure is to 
sample only from the population of 
answers or hypotheses that has sur- 
vived elimination in previous trials. 
This procedure is called sampling 
without replacement. It should pro- 
duce greater learning and transfer to 
problems involving a pattern of elimi- 
nating incorrect answers, even when 
new concepts replace the ones used 
during training. To build upon the 
studies by Anderson and by Wittrock 
et al. described above, a strategy of 
sampling without replacement was 
taught to the entire IQ distribution of 
two second-grade classes of public 
school children. 

The hypothesis tested in this study 
is as follows. With young children 
solving problems involving four con- 
cepts, the nonreplacement strategy 
produces learning, near transfer, and 
remote transfer greater than does the 
replacement strategy, followed in turn 
by the control procedure. The non- 
replacement strategy was operation- 
ally defined by having the children 
eliminate in sequence cards which 
named incorrect concepts. The re- 
placement strategy was like the non- 
replacement strategy, except that the 
child was not taught to eliminate the 
cards for the incorrect hypotheses. 
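
Why eliminating incorrect hypotheses should help can be illustrated with a simple expected-value comparison. The model below is an assumption made for illustration and is not the author's: a learner tests one hypothesis per trial, chosen uniformly at random, either from all n hypotheses (with replacement) or only from those not yet rejected (without replacement).

```python
from fractions import Fraction

def expected_trials_with_replacement(n):
    """Mean trials to first success when each trial succeeds with probability 1/n."""
    return Fraction(n, 1)  # mean of a geometric distribution with p = 1/n

def expected_trials_without_replacement(n):
    """Mean trials when rejected hypotheses are never tried again.

    The correct hypothesis is equally likely to be tried 1st, 2nd, ..., nth.
    """
    return sum(Fraction(k, n) for k in range(1, n + 1))  # equals (n + 1) / 2

n_concepts = 4  # color, size, shape, number
print(expected_trials_with_replacement(n_concepts))     # 4
print(expected_trials_without_replacement(n_concepts))  # 5/2
```

Under this simplified model, never repeating a rejected hypothesis cuts the expected number of trials from 4 to 2.5 for four concepts.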

The basis for the hypothesis is a 
“hierarchical” model of verbally me- 
diated transfer. Verbally mediated 
transfer is hypothesized to be a func- 
tion of the hierarchies of verbal as- 
sociations the learner brings to the 
problem. These include his preferences 
for choosing among competing hier- 
archies associated to the problem, the 
dominance of the levels within each 
hierarchy, his habits for sequentially 
selecting among levels within a hierarchy, 
and his habits of sequentially 
selecting responses within a given level 
of a hierarchy. 

Word-association norms provide 
one estimate of superordinate, coordi- 
nate, and subordinate associations 
within hierarchies (cf. Miller, 1951). 
But, in this study, only children's 
strategies for choosing among re- 
sponses all of one level of a hierarchy 
were studied. Therefore, predictions 
about superordinate and subordinate 
associations will not be discussed here. 
Sampling without replacement is an 
uncommon but efficient strategy for 
selecting hypotheses from one level of 
a hierarchy. 


METHOD 


Experimental Design 


A fixed effects model, simple randomized 
design was used. The Ss were individually 
assigned at random to one of three treat- 
ments: Nonreplacement, Replacement, and 
Control. Each S was successively given fa- 
miliarization training, preexperimental train- 
ing, random assignment to one of three 
experimental conditions, experimental train- 
ing, and the posttests. 


Subjects 


The Ss were 119 second grade students, 
66 males and 53 females, the entire second 
grade populations from two Los Angeles 
public elementary schools.² From the original 
group of 128 children, nine were dropped 
because of frequent absence. The means and 
standard deviations of the treatment groups' 
MA, CA, and IQ, from the California Public 
School Primary Test of Mental Abilities, 
are presented in Table 1. These measures 
show no particularly divergent means or 
variances for any of the treatment groups. 


² The author wishes to thank Sue Buckner 
who wrote the instructional materials and 
gathered the data. Thanks are due to the 
following people who made the study possible: 
Thomas Reece, Area Superintendent, 
Elementary Area West, Los Angeles City 
Schools; William Haley, Principal of the 
Marquez Elementary School; and 
Louise Powell, Principal of the Richland 
Avenue Elementary School. 


TABLE 1 
MEANS AND STANDARD DEVIATIONS OF THE 
TREATMENT GROUPS' MA, CA, AND IQ 

Treatment condition            MA      CA      IQ 
Nonreplacement        M       96.9    84.9   114.1 
                      SD      11.2     2.2    12.1 
Replacement           M       98.4    85.6   115.4 
                      SD      10.2     3.5    13.6 
Control               M       94.7    85.6   111.2 
                      SD      10.4     3.8    13.3 


Apparatus 


Each S was seated in one of 10 booths in 
a school room reserved for the experiment. 
He saw slides on a screen at the front of the 
room. The instructions were recorded on 
magnetic tape and played through ear- 
phones to each child. He responded to each 
problem by pushing a button on his two- 
choice response box. Each of the two buttons 
on the response box would automatically 
turn green or red to indicate that a choice 
was correct or incorrect, respectively. 

In addition to the slide projector and 
the response boxes, the apparatus consisted 
of a master control panel, a stereo magnetic 
tape deck, and a paper tape data recorder. 
The slide projector was modified with photo- 
electric cells, which actuated the master 
control panel, displayed for the experimenter 
the correct choice for an item, and provided 
immediate feedback to each child. One chan- 
nel of the stereo magnetic tape provided 
instructions to the children. The other chan- 
nel of the tape provided impulses to con- 
trol the automatic apparatus. Once the 
experimenter loaded the slide projector and 
started the tape, the presentation of stim- 
uli, the control of the length of the re- 
sponse interval, the immediate feedback to 
S, and the recording of the responses were 
all automated. 

Each S had four cards, each with words 
and designs to represent the four concepts: 
color, size, shape, and number. On the in- 
side surface of the front vertical wall of 
his booth were hooks to hang the cards. 
The experimenter manually recorded the 
placement of cards on the hooks. The data 
recorder registered only the button presses. 


Materials 


Each slide presented a matching-to-sam- 
ple stimulus which varied in four ways: 
color, size, shape, and number. The stimu- 


lus to be matched was centered toward the 
top. The two choices for matching were 
below it, one on either side. Each problem 
involved a series of six or more slides. The 
basis for matching one of the two bottom 
pictures to the top picture was always one 
of the four concepts given above. The in- 
stances of each of these four concepts were 
as follows. For color, they were blue, green, 
yellow, and red. For shape they were cat, 
circle, diamond, and house. Large and small 
were used for size; one, two, three, and 
four were used for number. 

The posttests measured learning and 
transfer. The tests of learning were made 
from the materials described above. The 
first transfer test introduced new colors 
(black, orange, purple, and green) and new 
shapes (boat, swan, square, and cone). 

The remote transfer test introduced four 
new concepts. Triangles varied in the di- 
rection of their apexes (up or down), the 
texture of their interiors (dotted or plain), 
their rectangular borders (solid or dotted), 
and the position of a short horizontal line 
inside the border (at the base or at the 
apex of the triangle). 


Procedure 


With the exception of the experimental 
training, the procedure was identical for all 
groups. Each child worked individually in a 
booth. Children participated in groups of 
10. A session lasted approximately 12 min- 
utes. There were 13 days of training. The 4 
days of testing were given immediately af- 
ter training. 

Familiarization training. In one session 
the children were given a series of eight 
problems. Each child was taught to push the 
button which corresponded to the one of 
the two bottom pictures identical to the top 
picture. 

Preexperimental training. The object 
was to establish associations between the 
concepts and the cards to be used in ex- 
perimental training. The first problems in- 
volved two concepts—color and shape— 
and two instances of each, green and red, 
and circle and cat. For each child, the task 
was to hang on the hook the card which 
illustrated the basis for matching the bottom 
picture to the model picture. After these 
problems, three instances of each of the 
two concepts were introduced. This gradual 
progression of difficulty was followed until 
all four concepts and all given instances of 
each had been learned. A test over training 
was then given. It consisted of eight 
problems of six slides each. The children who 
scored 80% or more correct on this test were 
assigned to the experimental training. 

Experimental training. All 119 of the 
children who regularly attended the ses- 
sions successfully completed the preexperi- 
mental training and were individually as- 
signed at random to one of the three 
treatments. The experimental training lasted 
four days and consisted of eight problems of 
six slides each. All problems and all slides 
were identical for all three treatment groups. 
The treatments differed only in the verbal 
instructions. The first slide of each problem 
was always designed to enable the child to 
eliminate half of the four choices. The lower 
left picture would always match the top 
picture in two ways while the lower right 
picture would always match the top picture 
in the other two possible ways. 

The children in the Nonreplacement 
Group were taught to hang the four con- 
cept cards on the four hooks in front of 
them. They hung the cards to represent 
the ways the right and left bottom pictures 
matched the top picture. They then pressed 
buttons to find which of the two lower 
pictures was the correct match. Each child 
then took from the hooks and turned face 
down the two cards which represented the 
incorrect bases for matching. He used the 
two remaining cards on the next slide in 
the set. Again he pressed a button to find 
the correct match, and he turned face down 
the card which did not represent the correct 
basis for matching. The one remaining card, 
which should now be the correct choice, was 
used to match the other slides in the set. 
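
The nonreplacement procedure described above amounts to eliminating candidate bases for matching as feedback arrives. The following is a minimal sketch of that elimination logic, not the authors' materials: the function name `eliminate` is hypothetical, and only color and shape are concept names taken from the text (size and number are assumed for illustration).

```python
def eliminate(candidates, left_bases, right_bases, correct_side):
    """Keep only the candidate bases consistent with the feedback on one slide."""
    consistent = left_bases if correct_side == "left" else right_bases
    return candidates & consistent


# Hypothetical concept cards; only "color" and "shape" are named in the text.
cards = {"color", "shape", "size", "number"}

# Slide 1: each lower picture matches the top picture in two of the four ways,
# so the feedback removes half of the cards.
cards = eliminate(cards, {"color", "size"}, {"shape", "number"}, "left")

# Slide 2: the two remaining cards are split between the two lower pictures,
# so the feedback leaves a single card.
cards = eliminate(cards, {"color", "shape"}, {"size", "number"}, "right")

print(cards)
```

Under these assumptions two slides suffice to isolate one basis out of four, which is why the remaining card can be used for the other slides in the set.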

With one exception, the Replacement 
Group was taught the procedure described 
above for the Nonreplacement Group. No 
mention was made of turning face down 
those cards eliminated when they were in- 
correct. 

In the Control Group, Ss were given the 
procedure described above for the Replace- 
ment Group, but with one exception. They 
were given the cards without instruction 
to hang them underneath the right and left 
choices. They were instructed to find the 
one of four rules for matching. They were 
told to use the cards, if they wished, in any 
way they would like. 

Testing. Four posttests were presented 
immediately after the experimental learning. 
These tests are described above in the 
Materials section. The first test, a test of 
learning, presented the same concepts and 
instances used during experimental train- 
ing. The second test measured transfer to 


new instances of the concepts color and 
shape. Each child was allowed to use his 
four concept cards on these two tests. 

The third test was identical to the second 
except that the children were not permitted 
to use their concept cards. The fourth test 
measured remote transfer. The concepts and 
instances of this test had never previously 
appeared in the experiment. No cards were 
permitted on this test. 

At the beginning of each of the four 
tests, the children were told all of the four 
possibly correct bases for answering the 
problems. The child’s task was to remember 
these bases and to determine the correct 
one for each problem. At no time was he 
given a cue or strategy during any test. 


RESULTS AND DISCUSSION* 


A heterogeneity of variance test 
(Edwards, 1960, p. 105) was run on 
each dependent variable. The assump- 
tion of homogeneity of variance was 
not rejected for any of the four variables. 
A Treatments × Schools analysis 
of variance, fixed effects model, 
was performed on each of the four 
dependent variables. The F ratio for 
the treatment mean square was sta- 
tistically significant (p < .01) for 
each of the four dependent variables. 
The F ratios were 7.28 for learning, 
9.47 for transfer to new instances 
with cards, 17.62 for transfer to new 
instances without cards, and 5.62 for 
transfer to new concepts. The schools 
effect was statistically significant 
(F = 4.85, p < .05) on the transfer- 
to-new-concepts test. There was no 
statistically significant interaction on 
any of the four dependent variables. 

In Table 2 the means and standard 
deviations for the four dependent var- 
iables are listed by treatments within 
each of the two schools. From high to 
low, the predicted rank order, Non- 


*The UCLA Campus Computing Facility 
and the UCLA Biomedical Computing 
Facility were used for the statistical analyses 
of the data. The research was financed 
by the Ford Foundation Fund for the Advancement 
of Education. 
TABLE 2 
MEANS AND STANDARD DEVIATIONS OF THE POSTTEST SCORES 

                                         Transfer to     Transfer to     Transfer to
Treatment groups              Learning   new instances   new instances   new concepts
                                         with cards      without cards
                                                         (40 items)      (25 items)
School I 
  Nonreplacement group   M      29.4       29.6            30.1            17.8 
                         SD      4.4        5.2             4.0             3.5 
  Replacement group      M      25.1       24.9            26.4            16.9 
                         SD      6.3    [illegible]         7.6             5.5 
  Control group          M      22.4       20.6            20.9            14.2 
                         SD      5.2        4.7             6.5         [illegible] 
School II 
  Nonreplacement group   M      25.7       25.0            27.7            16.1 
                         SD      8.5        8.5             6.5             5.0 
  Replacement group      M      22.9       24.0            23.8            14.4 
                         SD      6.5        6.1             6.2             5.2 
  Control group          M      21.7       20.9            20.1            12.8 
                         SD      7.2        8.0             6.4             4.7 


Note.—N = 19 for the control group in School I; for all other groups, N = 20. 


replacement, Replacement, and Con- 
trol, occurred on each of the four tests. 
This rank order occurred in each test 
in each school. 

A Newman-Keuls multiple comparisons 
test (Winer, 1962, p. 80) was run 
on each of the four dependent varia- 
bles. For the learning test, the Non- 
replacement Group and the Replace- 
ment Group each had significantly 
higher means than the Control Group 
(p < .05). But there was no statisti- 
cally significant difference (p > .05) 
between the Nonreplacement and the 
Replacement Groups. Both experimen- 
tal groups learned better than did the 
Control Group. 

On the transfer-to-new-instances 
test with cards, the mean of the 
Nonreplacement Group was statisti- 
cally greater than the mean of the 
Replacement Group or the mean of 
the Control Group. The Replacement 
Group and the Control Group’s means 
were not statistically significantly different 
from each other. The Nonreplacement 
Group transferred better 
than either of the other groups. 

For the test of transfer-to-new-in- 


stances without cards, all differences 
among the means of the three groups 
were statistically significant (p < 
.05). The Nonreplacement Group was 
highest, the Replacement Group next, 
and the Control Group last. The re- 
sults of this test were as predicted. 

On the last test, transfer-to-new- 
concepts without cards, the mean 
of the Nonreplacement Group was 
greater (p < .05) than either the 
Replacement Group or the Control 
Group. The Replacement and the 
Control Group means were not 
statistically significantly different from 
each other. 

In brief, the results are in close 
agreement with the predictions, espe- 
cially on the transfer-to-new-instances 
test without cards. The children in 
the experimental groups learned to use 
the strategies. They transferred the 
strategies to new problems. Transfer 
was best for the nonreplacement strategy. 

The most exciting finding was that 
the nonreplacement strategy trans- 
ferred to problems where the cards 
were removed. The absolute mean differences 
and the statistical significance 
of the differences among the 
means increased on these two tests 
compared with the tests of learning 
and transfer-to-new-instances with 
cards. The children not only learned 
to use the strategies, but they could 
transfer them to problems without 
the cards or props, even to difficult 
problems involving transfer to four 
new concepts. 

These results occurred in two public 
school systems using the entire second 
grade population of each school, not 
with a highly selected sample of chil- 
dren. The author’s previous attempts 
at obtaining remote transfer with ver- 
bal cues were not effective. However, 
a nonreplacement strategy (call it a 
verbal sequence if you prefer) was ef- 
fective in this study on the same trans- 
fer tests used in the previous studies. 

The data of this study lend support 
to the hypothesis that strategies can 
mediate transfer to new problems. 
The data also suggest that young 
children can learn and transfer a 
complicated strategy. In this study 
second graders learned a nonreplace- 
ment strategy. They also transferred 
it to contexts substantially different 
from those used during training. 


To develop a theory of instruction, 
data are needed about treatments 
which produce transfer. The teaching 
of problem-solving strategies to young 
children is both a way to obtain 
transfer and to develop a theory of 
instruction. 
SOME PSYCHOLOGICAL ASPECTS OF SUBJECT- 
MATTER STRUCTURE¹ 


PAUL E. JOHNSON* 
University of Minnesota 


A multiple-response verbal association test and a similarity judgments 
test were administered to 24 students of physics. The occurrence of 
words from the subject matter as responses on the association test 
was related to the frequency with which the words appeared in the 
written materials. Response distributions to words were similar for 
those words which were judged to represent similar concepts in the 
subject matter. Differences in achievement were related to the num- 
ber of responses given per word on the association test as well as 
to the extent to which Ss' associations represented constraints in 
the subject matter. 


Many of the concepts in a subject 
matter are defined by means of their 
relations with other concepts in that 
subject matter rather than by the 
presence or absence of specific en- 
vironmental attributes (Carroll, 
1964). This is especially true of 
physics, where much of the logical 
structure consists of interrelations 
among concepts which are specified 
formally by constraints within the 
written materials. Learning such a 
subject matter is in part a case of in- 
ternalizing these relations, and one 
aspect of the psychological structure 
of physics consists of the learned in- 
terrelations among the words used to 
represent its concepts. 

To the extent that a subject matter 
attempts to describe a certain por- 


¹This work was supported in part by 
grants to the University of Minnesota Center 
for Research in Human Learning, from the 
National Science Foundation (GS-41), the 
National Institute of Child Health and 
Human Development (HD-01136-01) and 
from the Graduate School of the University 
of Minnesota. Portions of this paper were 
read at the annual convention of the American 
Educational Research Association, February 
1966. 

* The author would like to express his appreciation 
to the chairman and his staff in 
the Science Department at the University of 
Minnesota Laboratory School for making 
this experiment possible. 


tion of human experience, which most 
do, there are also concepts present 
which are defined in terms of environ- 
mental data. Examples from this 
class of concepts in physics are oper- 
ationally defined quantities such as 
mass, distance, and time. 

The present paper is concerned 
with relationally defined concepts 
within the relatively stable subject 
matter domain of Newtonian Me- 
chanics, and also with nonrelationally 
defined concepts within this domain 
insofar as they are elements of rela- 
tions used to define other concepts. 
The purpose of the paper is to ex- 
amine the associative relations that 
exist among the verbal labels for these 
kinds of concepts and then to con- 
sider the information that such rela- 
tions provide about the structure of 
mechanics for individuals who are 
judged to have reached different 
levels of achievement in the subject 
matter. 

In two previous papers (Johnson, 
1964, 1965) it has been shown that 
the distributions of verbal associa- 
tions to the words in a subject matter 
(the associative meanings of the 
words) reflect at least in part the 
constraints of the subject matter. 
However, words which represent concepts 
in mechanics may be associatively 
related not only in the sense 
that one word elicits the other and 
vice versa, but also in the sense that 
as stimulus words they have re- 
sponses in common. Words with over- 
lapping response distributions have 
associative meanings which are simi- 
lar in some degree and an index of 
this similarity among pairs of words 
is the overlap or intersection coeffi- 
cient (Deese, 1962). 

The structure of a subject matter 
is usually such that its concepts occur 
in more than one logical relation or 
constraint. Both associative meaning 
and associative overlap, however, are 
based upon procedures in which each 
subject (S) gives only a single re- 
sponse to each word on the associa- 
tion test. Once an S produces a re- 
sponse to a word, he is not permitted 
to give other responses to that word 
which may also be readily available 
and which would indicate learned 
associations involving the given word 
in addition to the relation suggested 
by the first response. Allowing each 
S to give only one response to each 
stimulus word also eliminates from 
consideration less available responses 
which may reflect learned relations 
among concepts that are utilized in 
more complex problem-solving tasks 
where the dominant associations are 
not appropriate. 

A modification of associative mean- 
ing which has been adopted in the 
present investigation is one in which 
the associative meaning of a word is 
different for each S and is defined for 
each word as the ordered set of re- 
sponses given by an S to that word 
in a fixed amount of time. The index 
of the degree of associative overlap 
between pairs of words is now termed 
a relatedness coefficient (Garskoff & 
Houston, 1963) and is a function of 
both the responses in common and 
their relative positions within their 
respective response hierarchies. 
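
To make the idea of a position-weighted overlap concrete, here is a minimal sketch of a relatedness index of this kind. The text does not print the formula, so the details are assumptions made for illustration: each stimulus is treated as the implicit first response to itself, ranks run from n (first position) down to 1 where n is the length of the longer hierarchy, and the sum of rank products for shared words is divided by the sum of squared ranks minus one. The function names `ranked` and `relatedness` are hypothetical, and Garskoff and Houston (1963) should be consulted for the published definition.

```python
def ranked(stimulus, responses):
    """Response hierarchy with the stimulus as its own implicit first entry."""
    hierarchy = [stimulus]
    for word in responses:
        if word not in hierarchy:  # keep only the first occurrence of a word
            hierarchy.append(word)
    return hierarchy


def relatedness(stim_a, resp_a, stim_b, resp_b):
    """Rank-weighted overlap of two ordered response lists (assumed form)."""
    list_a = ranked(stim_a, resp_a)
    list_b = ranked(stim_b, resp_b)
    n = max(len(list_a), len(list_b))
    # Earlier positions receive higher ranks: n, n - 1, ..., down to 1.
    rank_a = {word: n - i for i, word in enumerate(list_a)}
    rank_b = {word: n - i for i, word in enumerate(list_b)}
    numerator = sum(rank_a[w] * rank_b[w] for w in rank_a if w in rank_b)
    denominator = sum(r * r for r in range(1, n + 1)) - 1
    return numerator / denominator if denominator else 0.0


full = relatedness("FORCE", ["MASS", "ACCELERATION"],
                   "MASS", ["FORCE", "ACCELERATION"])
partial = relatedness("FORCE", ["MASS"], "TIME", ["MASS", "SPEED"])
none = relatedness("FORCE", ["MASS"], "TIME", ["SPEED"])
print(full, partial, none)
```

With this form the index reaches 1.0 only when two different stimuli elicit each other first and share all remaining responses in the same order, and it is 0.0 when the hierarchies have no word in common.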


Associations learned among words 
as a result of their contiguous occur- 
rence in the constraints of a subject 
matter may relate concepts which 
are, at least for the beginner, quite 
different things. And students learning 
a subject such as Newtonian Me- 
chanics for the first time may make 
judgments about the conceptual simi- 
larity of words such as mass and 
velocity which are quite different 
from determinations made of their 
associative similarity by means of 
verbal association. 

A second measure of psychological 
relation was therefore considered in 
the present investigation in order to 
compare the associative similarity of 
words with a judgment of the simi- 
larity of the concepts represented by 
the words. 


METHOD 


The Ss in this investigation were 24 sen- 
iors, 16 male and 8 female, from the Univer- 
sity of Minnesota Laboratory School. The 
Ss were divided into two groups on the basis 
of their achievement in Newtonian Mechanics. 
The 12 Ss whose course grades were in 
the upper one-half of the distribution of 
grades were termed the high achievers and 
the 12 students whose grades were in the 
lower one-half were termed the low achievers. 
Similar proportions of each sex occurred 
in each achievement group. 

A test booklet was prepared containing a 
page of general instructions, a continued 
word association test, and a similarity judg- 
ments test, in that order. The general in- 
structions given to each S on the first page 
of the booklet were as follows: 

 The following pages contain test items 
designed to find out how people use words 
which represent concepts in physics. There 
are two tests, neither of which has anything 
to do with ability in physics, but 
which are intended to reveal facts about 
the usage of physics words not discoverable 
by more usual techniques. Treat each 
item on both tests as carefully as you can, 
even when the item calls for a judgment 
that seems almost impossible to make. 
Work rapidly, however, because we 
learn most about how physics words are 
used if you give us the first response which 
occurs to you. You may open the booklet 
and begin with the first page. Read the 
instructions carefully. 

As a basis for selecting the stimulus words 
to be used in the present investigation, a fre- 
quency count was made of the words which 
represented concepts in Newtonian Mechanics 
in Ss’ textbook (Dull, Metcalf, & Williams, 
1960). This word count distinguished 
between the occurrence of a word representing 
a concept in running text, and the occurrence 
of the same word in all other situations 
(e.g., problems, questions, etc.). 
Fourteen words were selected from this word 
count so as to represent as much of the fre- 
quency range as possible, under the restric- 
tion that the list of words include six con- 
cepts whose definitions within the text 
consisted of simple physical equations 
(Johnson, 1965). The frequency range repre- 
sented by the 14 words was from 1.26 oc- 
currences per 100 words of running text for 
the word FORCE (one of the most frequently 
occurring concepts) to .02 occurrences per 
100 words of text for the word IMPULSE (one 
of the least frequently occurring concepts). 
The rank-order correlation between the fre- 
quency of occurrence of these 14 words in 
running text and their frequency of occurrence 
in all other situations was .94, p < 
.001. The 14 words ranked according to their 
frequency of occurrence are presented later 
in Tables 3 and 4. 

Each of the 14 words appeared 24 times 
on a page in the association test and each 
S's association test contained an individual 
random order of the 14 words. The instruc- 
tions given to each S in the association test 
were as follows: 

This is a test to see how many words 
you can think of and write down in a 
short time. You will be given a key word 
which represents a concept in physics and 
you are to write down as many other 
words which this key word brings to mind 
as you can. No one is expected to fill in all 
the spaces on a page, but write as many 
words as you can which the key word 
brings to mind. Be sure to think back to 
the key word after each word you write 
down because the test is to see how many 
other words the key word makes you think 
of. A good way to do this is to repeat the 
key word over and over to yourself as you 
write. You will have one minute on each 
page. I will tell you when to go on to the 
next page. Are there any questions? 

Each of the 14 words on the association 
test was paired with every other word to 
create 91 pairs of words which served as 
stimuli on the similarity judgments test. 
Each pair of words on this test was followed 
by a 7-point rating scale, the anchors of 
which were the words similar and dissimilar. 
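
The count of 91 pairs follows from taking every unordered pair of the 14 stimulus words, that is, 14 × 13 / 2. A short sketch of that enumeration, using the word list given in the tables of this article (the variable names are illustrative):

```python
from itertools import combinations

words = [
    "FORCE", "VELOCITY", "ACCELERATION", "MASS", "WEIGHT", "WORK",
    "DISTANCE", "TIME", "SPEED", "POWER", "ENERGY", "INERTIA",
    "MOMENTUM", "IMPULSE",
]

# Every word paired once with every other word, ignoring order within a pair.
pairs = list(combinations(words, 2))
print(len(pairs))
```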

Two different random orders of the 91 
pairs of words were constructed with the re- 
strictions that (a) no pair of words be ad- 
jacent to the same pairs of words in the two 
random orders and (b) no adjacent pair 
contain the same concepts. Word pairs were 
reversed within each random order so that 
each word was judged with every other word 
both when it appeared before and when it 
appeared after the word. The different or- 
ders of the pairs of words were separated 
within each random order so that no book- 
let contained a given pair of words more 
than once. 

Ten pairs of words were printed on each 
page of the test booklet, except for one page 
which contained 11 pairs of words. Pages for 
both orders of the pairs of words were as- 
sembled in six different random orders, thus 
creating a total of 24 different forms of the 
judgments test. Six Ss at each level of 
achievement received each random order of 
words, two Ss at each level received each 
random order of the pages, and each S at 
each level received one of the two orders of 
the pairs of words. All randomization for the 
investigation was carried out by means of a 
table of random numbers. 
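
The total of 24 forms can be read as the product of three design factors: two random orders of the pairs, two within-pair word orders, and six page orders. The sketch below enumerates that crossing; the labels are placeholders chosen for illustration, and the reading of the design as a full 2 × 2 × 6 crossing is an interpretation of the description above.

```python
from itertools import product

random_orders = ["random order A", "random order B"]   # two sequences of the 91 pairs
within_pair_orders = ["original", "reversed"]          # which word of a pair comes first
page_orders = [1, 2, 3, 4, 5, 6]                       # six random page assemblies

# One booklet form per combination of the three factors.
forms = list(product(random_orders, within_pair_orders, page_orders))
print(len(forms))
```

With 12 Ss at each achievement level, this crossing is consistent with six Ss per level receiving each random order and two Ss per level receiving each page order.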

The instructions given to each S on the 
similarity judgments test were as follows: 

 On the following pages you will find 
pairs of words which represent concepts in 
physics. Each pair of words is followed by 
a rating scale. These rating scales are designed 
to see how similar or dissimilar you 
feel the concepts represented by the words 
are to one another. If you feel the concepts 
represented by a particular pair of 
words are similar in some degree, then 
place an “X” in a blank near the “Similar” 
end of the rating scale. If, on the 
other hand, you feel that the concepts 
represented by the words are dissimilar in 
some degree, place an “X” in a blank near 
the “Dissimilar” end of the scale. The degree 
of similarity or dissimilarity is indicated 
by how far you place your “X” 
from either end of the scale. 

The Ss were timed in the association test 
and allowed to proceed at their own speed 
on the judgments test. All Ss finished the 
booklet in one 45-minute class period. 


RESULTS 


The words on the association test 
were divided into two categories according 
to their frequency of occurrence 
in the subject matter: those 
words that occurred with a frequency 
greater than the median frequency 
for the 14 words of .11 occurrences per 
100 words of running text and those 
words that occurred with a frequency 
less than this median frequency. A 
comparison was made between high 
and low achievers based upon the 
number of responses they produced to 
words in these two categories. The 
mean number of responses given by 
each type of S to words in each of the 
two categories is presented in Table 1. 

An analysis of variance with re- 
peated measures on the word count 
factor (Winer, 1962) was performed 
on the number of responses given by 
the high and low achievers to the 
words in these two categories. The 
high achievers gave significantly 
more responses per stimulus word 
than the low achievers (F (1,22) = 
6.79, p < .05), and the high-fre- 
quency words elicited significantly 
more responses from Ss than the low- 
frequency words (F(1,22) = 6.87, 
p < .05). The interaction between 
achievement and word-count frequency 
was not significant. 

One index of the psychological im- 
portance of words is the frequency 
with which they occur as responses on 
an association test. An index of the 
logical importance of the concepts in 
a subject matter is the relative fre- 
quency with which their verbal labels 


TABLE 1 
MEAN NUMBER OF RESPONSES PER WORD 
ON THE ASSOCIATION TEST 

                   Performance in the subject matter 
Word-count         High achievers       Low achievers 
frequency            M       SD           M       SD 
High               9.64     3.97        6.71     2.38 
Low                8.86     3.84        6.24     2.84 


occur in the materials used to present 
it. A determination was therefore 
made of the relation between the logi- 
cal and psychological importance of 
words used to represent concepts in 
Newtonian Mechanics. 

The 14 words used as stimuli on 
the association test were ranked according 
to their frequency of occurrence 
in the subject matter of mechanics. 
Each of the 14 words was 
also ranked on the basis of how often 
it was given as a response by the high 
and low achievers in the association 
test. The rank-order correlation be- 
tween these two rankings of the 14 
words was .87 (p < .001) for the high 
achievers and .52 (p > .05) for the 
low achievers. 
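
A rank-order correlation of this kind can be computed by ranking both variables and correlating the ranks. The following is a minimal sketch, assuming that tied values receive the average of the ranks they span; the function names are illustrative and the article's own frequency counts are not reproduced here.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_order_correlation(x, y):
    """Product-moment correlation computed on the two sets of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up values, only to show the call.
same_order = rank_order_correlation([1, 2, 3, 4], [10, 20, 30, 40])
opposite_order = rank_order_correlation([1, 2, 3, 4], [40, 30, 20, 10])
print(same_order, opposite_order)
```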

Not only may words be psycho- 
logically important in the sense that 
they are given as responses to many 
different words, but also in the sense 
that when they occur as responses 
they do so near the top of Ss' response 
hierarchies. Each of the 14 words was 
therefore ranked according to its 
average position within the response 
hierarchies of the words on the asso- 
ciation test. Separate rankings were 
made for high and low achievers and 
these rankings were each correlated 
with the ranking of the 14 words 
based upon their frequency of occurrence 
in the subject matter. This correlation 
was .52 (p > .05) for the 
high achievers and −.05 (p > .05) 
for the low achievers. 

A number of the words on the as- 
sociation test represented concepts 
which are defined by simple equa- 
tion relations in the subject matter 
of mechanics. These relations are 
FORCE = MASS × ACCELERATION, 
ACCELERATION = VELOCITY/TIME, VELOCITY = 
DISTANCE/TIME, WORK = FORCE × DISTANCE, 
POWER = WORK/TIME, and 
MOMENTUM = MASS × VELOCITY. A comparison 
was made between high and 
low achievers based upon the number 
of their associations, termed con- 
strained associations, to the nine 
words in the above six equations 
which were consistent with the rela- 
tions specified by the equations. Thus, 
VELOCITY was a constrained associate 
of both ACCELERATION and DISTANCE 
whereas POWER was not a constrained 
associate of either of them. 
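
One way to read the definition above is that a response counts as a constrained associate of a stimulus word when the two words appear together in at least one of the six equations. The sketch below encodes that reading; it is an interpretation for illustration, and the names `EQUATIONS` and `constrained_associates` are hypothetical.

```python
# Each tuple lists the three words that appear in one of the six equations.
EQUATIONS = [
    ("FORCE", "MASS", "ACCELERATION"),
    ("ACCELERATION", "VELOCITY", "TIME"),
    ("VELOCITY", "DISTANCE", "TIME"),
    ("WORK", "FORCE", "DISTANCE"),
    ("POWER", "WORK", "TIME"),
    ("MOMENTUM", "MASS", "VELOCITY"),
]


def constrained_associates(word):
    """Words that share at least one equation with the given word."""
    associates = set()
    for equation in EQUATIONS:
        if word in equation:
            associates.update(equation)
    associates.discard(word)
    return associates


equation_words = {term for equation in EQUATIONS for term in equation}
print(len(equation_words))
print(sorted(constrained_associates("ACCELERATION")))
print(sorted(constrained_associates("DISTANCE")))
```

Under this reading the six equations involve nine distinct words, VELOCITY is a constrained associate of both ACCELERATION and DISTANCE, and POWER is a constrained associate of neither, which agrees with the example in the text.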

Each S's response hierarchy to each 
of the nine words in these six equations 
was divided in half and the 
number of constrained associations in 
each half was tabulated separately. 
The mean number of constrained as- 
sociations in each portion of the re- 
sponse hierarchy for each type of S 
is presented in Table 2. 

An analysis of variance with re- 
peated measures on the response hier- 
archy factor was performed on the 
number of constrained associations 
given by the high and low achievers 
in the upper and lower half of their 
response hierarchies. The high 
achievers gave significantly more con- 
strained associations to these words 
than the low achievers (F(1,22) = 
15.85, p < .001), and there were sig- 
nificantly more associations of this 
kind in the upper than in the lower 
one-half of the Ss’ response hierarchies 
(F(1,22) = 69.4, p < .001). 
The interaction between achievement 
and hierarchy was also significant 
(F(1,22) = 14.4, p < .001). 
That is, there was a greater difference 
in the mean number of constrained 
associations in the two portions of 
the response hierarchy for the high 
achievers than there was for the low 
achievers. 

In order to determine the associative 
similarity of words in the subject 
matter of Newtonian Mechanics, re- 
latedness coefficients were calculated 
among all pairs of the words in the 
association test. The calculation of 


TABLE 2 
MEAN NUMBER OF CONSTRAINED 
ASSOCIATIONS PER 
STIMULUS WORD 

                   Performance in the subject matter 
Portion of         High achievers       Low achievers 
response hierarchy   M       SD           M       SD 
Upper one-half     1.89      .92         .69      .78 
Lower one-half      .43      .60         .32      .56 


these coefficients was based upon the 
total distribution of responses given 
by each S to each of the 14 words 
rather than upon a subset of this dis- 
tribution constructed from logical 
categories in the subject matter. 

A tabulation was also made of 
each S’s similarity judgments to 
each pair of words. Median similarity 
judgments and median relatedness 
coefficients were computed separately 
for high and low achievers. The me- 
dian relatedness coefficients for the 
high achievers are presented above 
the main diagonal in Table 3 and the 
median relatedness coefficients for the 
low achievers are presented below this 
diagonal. 

The median similarity judgments 
were based upon a value of 7 for the 
similar end of the scale and a value 
of 1 for the dissimilar end of the 
scale. These values are presented 
above the main diagonal in Table 4 
for the high achievers and below for 
the low achievers. 

The rank-order correlation corrected 
for ties (Siegel, 1956) between a 
ranking of the 91 pairs of words ac- 
cording to median relatedness coeffi- 
cients and a ranking of the word 
pairs according to median similarity 
judgments was .75 (p < .001) for 
the high achievers and .65 (p < 
.001) for the low achievers. 

The 91 pairs of words were ranked 
TABLE 3 
MEDIAN RELATEDNESS COEFFICIENTS FOR 
HIGH ACHIEVERS (ABOVE DIAGONAL) AND 
LOW ACHIEVERS (BELOW DIAGONAL) 

                                         Stimulus words 
Stimulus words        1   2   3   4   5   6   7   8   9  10  11  12  13  14 
 1. FORCE             -  25  41  24  30  32  09  11  16  34  30  20  84  21 
 2. VELOCITY         05   -  55  08  02  18  35  31  51  16  15  06  37  22 
 3. ACCELERATION     05  39   -  29  21  19  32  28  42  18  20  22  25  23 
 4. MASS             27  01  01   -  53  18  04  03  08  11  20  31  24  07 
 5. WEIGHT           24  03   0  47   -  17  02  03  04  11  08  18  24  07 
 6. WORK             38  06  56  07  06   -  26  29  13  55  40  17  24  17 
 7. DISTANCE         06  18  04  01  03  13   -  27  30  20  08  09  16  17 
 8. TIME             01  26  02  01  01  22  26   -  23  33  14  05  16  15 
 9. SPEED            03  61  47  01  01  15  30  20   -  10  11  14  24  21 
10. POWER            36  04  01  12  11  56  01  02  02   -  41  15  14  17 
11. ENERGY            ?   ?   ?   ?   ?   ?   ?   ?   ?   ?   -  23  14  23 
12. INERTIA          26  09  08  05  17  11  02  13  01  23  24   -  28  24 
13. MOMENTUM          ?   ?   ?   ?   ?   ?   ?   ?   ?   ?   ?   ?   -  31 
14. IMPULSE          07  01  03  03  05  09   0  01  03  10  06   0  17   - 

Note.—Decimals omitted. ? = entry illegible in the source. 


separately according to the magni- 
tude of the median relatedness coef- 
ficients for the high and low achievers. 
This was done to determine the de- 
gree of agreement in the relative as- 
sociative similarity of the word pairs 
for the two groups of Ss. The rank- 
order correlation corrected for ties be- 
tween these two rankings of the pairs 
of words was .66 (p < .001). 

The 91 pairs of words were also 
ranked separately according to the 
magnitude of the median similarity 
judgment for the high and low 
achievers. This was done to determine 
the degree of agreement in the relative 
judged similarity of the 91 pairs of 
words for Ss in the two groups. The 
rank-order correlation corrected for 
ties between these two rankings of 
the word pairs was .60 (p < .001). 

Each of the words in Tables 3 and 
4 was paired with every other word 
to create a total of 13 pairs for each 
of the 14 words. Rank-order correla- 
tions corrected for ties were then 
computed between median related- 
ness coefficients and median similarity 
judgments across the 13 pairs for 
each of the 14 words. This was done 
to determine the relative differences 


among the 14 words in the extent to 
which associative and judged simi- 
larity were related to one another. 
These correlations were computed 
separately for high and low achievers 
and are presented in Table 5. 

Each S's distribution of relatedness 
coefficients to the 91 pairs of words 
was divided into two categories; those 
coefficients which exceeded the median 
coefficient of the distribution for that 
S and those coefficients which did not 
exceed this median. In a similar man- 
ner, the distribution of similarity 
judgments for each S was divided into 
those judgments that exceeded the 
median judgment for the S and those 
that did not. A phi coefficient was 
then computed for each S based upon 
the two categories of relatedness co- 
efficients and the two categories of 
similarity judgments. This was done 
to determine the degree of relation 
between the associative and judged 
similarity of words representing con- 
cepts in mechanics for each of the 24 
Ss. The values of these 24 coefficients 
ranged from .007 to .457 with 13 
significant at the .05 level (r > .20), 
9 significant at the .01 level (r > .27), 


TABLE 4 
MEDIAN SIMILARITY JUDGMENTS FOR HIGH 
ACHIEVERS (ABOVE DIAGONAL) AND LOW 
ACHIEVERS (BELOW DIAGONAL) 


Stimulus words 

Stimulus words 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 
1. FORCE |49/60/51/55/58|25|29 42152 45/50 49 46 
2. VELOCITY 0| |55|35|25 2948/58/02 28 45 rey 
3. ACCELERATION 48,57, — 55/57 40 40 58/56 45/45 45/98/37 
4. MASS 403823! (55457 14/14/22 40145 47 65 88 
5. WEIGHT M37 (1863). 45/14 1226/35133 434028 
6. WORK 55/25|30]33/40) | |37|40|28)67 61 3229.2 
7. DISTANCE |23)32|30| 15] 18/35) (8540/35195 90182 4 
8. TIME 20|40|S0|I7 (18.25.25... 4653 35 3018850 
9. SPEED 30,6450 35/25/33(34152| 25138 39 45197 
10. POWER 58/30148/35 97 63|30 92.45|.. |45 998324 
11. ENERGY (52/39/42 25/30157 25/21140 67 . 45145 
12. INERTIA 4830/30/30|48 40/21/25 /45/38 42 ojat 
13. MOMENTUM |50]49/55/53/52/35/23|23 53 40 44 50, 
14. IMPULSE 38/4525 35|25|25/24 28 |38|25 45|25 40) 

TABLE 5 

RANK-ORDER CORRELATIONS BETWEEN 
RELATEDNESS COEFFICIENTS AND SIMILARITY 
JUDGMENTS FOR EACH WORD 
WITH ALL OTHER WORDS ON THE 
ASSOCIATION TEST 

                   Performance in the subject matter 
Stimulus word      High achievers     Low achievers 
FORCE              [illegible]        .83*** 
VELOCITY           .82***             .50 
ACCELERATION       [illegible]        .54 
MASS               .82***             .69** 
WEIGHT             .90***             .85*** 
WORK               .44                .38 
DISTANCE           .94***             .46 
TIME               .80***             .56* 
SPEED              [illegible]        .55* 
POWER              .67**              .55* 
ENERGY             .78**              .88*** 
INERTIA            .68**              .66* 
MOMENTUM           .59*               [illegible] 
IMPULSE            .08                .27 


and 6 significant at the .001 level (r =
.34).
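
The median-split procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the six-pair data set are hypothetical, and values equal to the median are placed in the "did not exceed" category, as the wording of the text suggests.

```python
from math import sqrt
from statistics import median


def median_split(values):
    """Return True for each value that exceeds the median of the list."""
    m = median(values)
    return [v > m for v in values]


def phi_coefficient(xs, ys):
    """Phi coefficient for two dichotomized variables of equal length."""
    pairs = list(zip(xs, ys))
    a = sum(1 for x, y in pairs if x and y)          # high on both
    b = sum(1 for x, y in pairs if x and not y)      # high on first only
    c = sum(1 for x, y in pairs if not x and y)      # high on second only
    d = sum(1 for x, y in pairs if not x and not y)  # low on both
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


# Hypothetical scores for six word pairs from one S (not data from the study).
relatedness = [0.10, 0.40, 0.35, 0.05, 0.60, 0.20]
similarity = [20, 55, 50, 15, 30, 60]

phi = phi_coefficient(median_split(relatedness), median_split(similarity))
print(round(phi, 3))
```

In the study itself each S contributed 91 word pairs, so each phi coefficient would rest on a 2 x 2 table of 91 observations rather than the six used here.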

Finally Ss were divided into two 
groups according to the degree of rela- 
tion between their relatedness coeffi- 
cients and similarity judgments for 
the 91 word pairs. These groups consisted
of those Ss whose phi coefficient
exceeded the median coefficient for 
the combined groups and those Ss 
whose phi coefficient did not exceed 
this median. The division of Ss was 
compared with the two-category divi- 
sion of Ss made on the basis of 
achievement in the subject matter. A 
chi-square test indicated no signifi- 
cant relation between achievement 
and degree of relation between asso- 
ciative and judged similarity across 
word pairs representing concepts in 
the subject (χ² = .68, p > .05).
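
A chi-square test on a 2 x 2 table of this kind can be sketched as follows. The cell counts are hypothetical (the article does not report them), and no continuity correction is applied, since the text does not say whether one was used.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical split of 24 Ss: rows = high / low achievers,
# columns = phi coefficient above / not above the median.
chi2 = chi_square_2x2(7, 5, 5, 7)

CRITICAL_05 = 3.841  # chi-square critical value for 1 df at the .05 level
print(round(chi2, 2), chi2 > CRITICAL_05)
```

A value this far below the critical value corresponds to the nonsignificant result reported in the text.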


Discussion 


Words which represent concepts in 
physics are related to one another by 
constraints inside as well as outside 
the domain of the subject matter. One 
form of relation which may exist 
among words due to their contiguous 
occurrence in such constraints is that 
of association. 

An examination of the associative 
relations among words which repre- 
sent concepts in Newtonian Me- 
chanics indicates that those words 
which occurred frequently in the 
written materials were more meaning- 
ful for both high and low achievers 
than words which occurred infre- 
quently. In addition, both frequently 
and infrequently occurring words were 
more meaningful for high achievers 
than they were for low achievers. 

The occurrence of words in me- 
chanics as responses on an association 
test was also related, for high 
achievers, to the relative frequency 
with which these words occurred in 
the subject matter. The relative order 
in which words in mechanics oc- 
curred within the response hierarchies 
of other words in the subject matter 
was not, however, related to the fre- 
quency of occurrence of the response 
words in the written materials of 
mechanics for either high or low 
achievers. 

One form of subject-matter con- 
straint which serves to define con- 
cepts in mechanics is the physical 
equation. This form of constraint 
was represented in association more 
frequently for high achievers than for 
low achievers. To the extent that rela- 
tions appeared in association which 
were consistent with such constraints, 
they occurred early in Ss’ response 
hierarchies, especially for the high 
achievers. 

Judgments of the similarity of the 
concepts in Table 4 were related to 
the associative similarity of the words
representing these concepts as indexed 
by the coefficients in Table 3, and the 
degree of this relation was high for 
both high and low achievers. Fur- 
thermore, the relative associative sim- 
ilarity among the pairs of words in 
Table 3 was similar for the two 
groups of Ss, as was the relative 
judged similarity of the concepts in 
Table 4. 

The data presented in Table 5 indi- 
cate that the degree of relation be- 
tween associative and judged simi- 
larity across the word pairs involving 
a single concept depended upon both 
the concept and the group of Ss con- 
sidered. The highest degree of relation 
for the high achievers occurred in
those pairs in which one of the con- 
cepts was DISTANCE. For the low 
achievers, the highest degree of rela- 
tion occurred for the pairs involving 
the concept ENERGY. The three con- 
cepts for which there was least agree- 
ment among the high achievers as to 
the degree of relation between associ- 
ative and judged similarity were 
WORK, ACCELERATION, and IMPULSE, 
while the three concepts for which 
there was least agreement among the 
low achievers were WORK, DISTANCE, 
and IMPULSE. The concepts for which
there was the largest difference between
the two groups in favor of the
high achievers were VELOCITY, SPEED,
DISTANCE, TIME, and MASS. 
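
The rank order correlations in Table 5 can be illustrated with a short sketch. The article does not name the coefficient used; Spearman's rho with average ranks for ties is assumed here, and the function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the product-moment correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical relatedness coefficients and similarity judgments for one
# stimulus word paired with four other words.
rho = spearman([0.10, 0.40, 0.35, 0.05], [20, 55, 50, 15])
print(rho)
```

For each stimulus word in Table 5 the two sequences would hold 13 values, one for each of the other words on the association test.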

Students begin their formal study 
of physics with learned associations
among the words used to represent 
many of its concepts due to the verbal 
environments in which these words 
have occurred outside the domain of 
the subject matter. Within the sub- 
ject matter these words are related to 
one another and to other words by 
formal constraints which take the 
form of equations and definitions. 
The learning of these constraints pro- 


vides additional associations which, 
together with the previously learned 
relations, form the basis for the asso- 
ciative similarity among pairs of 
words. 

Judgments of conceptual similarity 
are also influenced by constraints both 
inside and outside the domain of the 
subject matter and associative and 
conceptual similarity may covary due 
to a mastery of the verbal relations 
which appear in these constraints. 
These judgments may, on the other 
hand, be unrelated to associative 
similarity either because the con- 
straints are unavailable to Ss at the 
time of testing or because the associa- 
tive relations representing the con- 
straints are selected from one domain 
while judgments of conceptual simi- 
larity are based upon constraints in 
another. 

Although in the present investiga- 
tion the general degree of relation 
between associative and judged simi- 
larity was high for both high and low 
achievers, there was considerable 
variation between the two groups of 
Ss for individual concepts in the sub- 
ject matter. The largest differences, 
which were in the direction of increas- 
ing competence, were for the opera- 
tionally defined concepts of MASS, 
DISTANCE, and TIME, either as they occurred
separately with other concepts
or as they occurred in combination 
with one another in the constraint 
VELOCITY = DISTANCE/TIME. 

Among the concepts in Newtonian 
Mechanics, operationally defined con- 
cepts have a particular status in that 
they alone serve to represent meas- 
urable properties of objects in a stu- 
dent’s environment. One result of mas- 
tering mechanics is that such concepts 
can be dealt with in terms of the sub- 
ject-matter constraints in which they 
occur rather than in terms of their 
more general environmental meanings 
determined by past experience. Perhaps
knowledge of the structure of a
subject such as Newtonian Mechanics
would proceed more rapidly than
it does if more of the words which 
represent its concepts were not re- 
lated to outside patterns of usage but 
were instead learned simply as they 
are related to one another by means 
of the formal constraints in the sub- 
ject matter. 
FREE-OPERANT PREFERENCE FOR ONE OF
TWO STORIES: 
A METHODOLOGICAL NOTE 


THOMAS C. LOVITT 
University of Washington 


Two recorded stories were simultaneously presented to seven 12-yr.-old
children. By programming these stories by conjugate reinforcement 
procedures S was able to select continuously the story of his choice by 
either pressing or not pressing a hand switch. These hand-switch re- 
sponses were graphically recorded, thus enabling E to determine not 
only the individually preferred story, but the quantitative amount this 
option represented. The advantages of the conjugate technique in ob- 
taining quantifiable preference data and future educational involve- 
ment of the conjugate procedure are discussed. 


Previous investigations have re- 
vealed that by conjugately or continu- 
ously offering a stimulus, either visual 
or aural, a subject's (S's) interest at 
any given time could be measured. 
Lindsley (1962), in analyzing the vis- 
ual narrative effects of specified tele- 
vision programs, demonstrated that, 
for his experimental Ss, rerun movies 
maintained a more rapid and consist- 
ent rate of response than did programs 
especially designed for television view- 
ing. In a study involving the compara- 
tive effects of five classes of verbal 
narration, Lovitt (1965) reported that 
although a conventional story gener- 
ated higher and more consistent re- 
sponding to acquire the narration, 
other and more abstruse verbalizations 
were more desirable than silence. 

Based on the evidence of these two 
investigations, which compared visual 
or aural narration to no stimulation, 
an experiment was designed to con- 
trast the reinforcing effects of two 
simultaneously offered stories. In ad- 
dition to obtaining each individual’s 
operant preference for either story, 
information would also be gathered 
relevant to his verbal choice. This 
latter concern was prompted by 
Morgan and Lindsley (1964), who 
compared individual operant and ver- 
bal preferences and reported that these 


dimensions of preference were not 
always synonymous. Their study, 
which involved the simultaneous offer- 
ing of monophonic and stereophonic 
music, demonstrated that while ac- 
tively consuming the stimuli some 
Ss preferred stereo music, whereas 
others demonstrated no preference. 
However, after the session, when the 
stimuli were no longer available, all 
Ss verbalized a choice for stereophonic 
music. 


METHOD


Subjects 


The participants in this study were five 
boys and two girls, all 12 years of age. All 
of the Ss were enrolled in either seventh- 
or eighth-grade classes in Northeast Johnson 
County, Kansas. 


Procedure 


All of the Ss participated in one experi- 
mental session lasting from 30 minutes to 1 
hour. During this session, each child sat in a 
partially sound-attenuated room furnished 
with only a chair, a set of headphones, and 
a hand microswitch. He was told only that 
he would be listening to stories and that he 
could press the switch if he so desired. 
Nothing was said about the reinforcement 
contingencies—whether it was or was not 
necessary to press the switch, or if so, at 
what rate. 

The microswitch, when pressed, produced 
a brief electric charge that affected the 
narrative intensity. Even if the switch was 
pressed and held down, only this brief intensity
change resulted. The response definer
converted each of these microswitch
presses into electrical pulses that in turn 
operated the conjugate reinforcer. 

Two stories, which had been previously 
recorded by the investigator, were pro- 
grammed by the conjugate-servo. They were 
begun at the same time and ran continu- 
ously throughout the experimental session. 

The stories available to each S were The 
Incredible Journey by Sheila Burnford 
(1961) and White Falcon by Elliott Arnold 
(1955). Both stories have been used fre- 
quently in classrooms and homes for 12-year- 
olds and have received such commendation 
as reinforcing literature for the preteen as 
being presented the William Allen White 
book award. 

In addition to their suitability for pre- 
adolescents, these stories met another cri- 
terion to encourage the factor of selection 
—they offered both constant narrative action 
and a certain amount of content redundancy. 
The former was a necessary ingredient, in 
that to present simultaneously two stories 
in a measure of preference, it is imperative 
that the narrative content of both stories 
be fairly equally reinforcing throughout. 
The matter of redundancy was an equally 
important aspect in that during some ex- 
perimental contingencies the preferred story 
would be unobtainable, thus leaving a gap
of a few minutes in its presentation. 

In order to attain the desired story at the 
preset maximal volume (which was set at 
a comfortable level) during experimental 
accelerating sessions, the S had to respond 
at a rate of 45 responses per minute. If he 
responded at a rate lower than this, the 
story gradually faded, and he was presented 
with portions of both stories at minimal 
volumes. If S did not respond at all, he was 
granted the alternate story, which was pro- 
grammed by decelerating contingencies. 
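
The contingency just described can be sketched as a simple function of response rate. Only the 45-per-minute requirement comes from the text; the linear fade between the two stories, the function name, and the volume scale are assumptions made for illustration, since the actual dynamics of the conjugate-servo are not reported.

```python
REQUIRED_RATE = 45.0  # responses per minute needed for full volume (from the text)


def story_volumes(rate_per_min, max_volume=1.0):
    """Return (accelerating_story_volume, decelerating_story_volume).

    Assumed model: the accelerating story's share of the preset maximum
    volume rises linearly with response rate up to REQUIRED_RATE, and the
    decelerating story receives the remainder.
    """
    share = max(0.0, min(rate_per_min / REQUIRED_RATE, 1.0))
    return share * max_volume, (1.0 - share) * max_volume


for rate in (0, 22.5, 45, 90):
    print(rate, story_volumes(rate))
```

Under this sketch an S who does not respond hears only the decelerating story, an S at or above the required rate hears only the accelerating story, and intermediate rates yield portions of both.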

The basic sequential order of each session 
consisted of the following segments: 


 J     W     W     J     J     W
---   ---   ---   ---   ---   ---
 W     J     W     J     W     J


 1     2     3     4     5     6


The J was used as the code letter for 
Incredible Journey; the W to denote White 
Falcon. The top letter indicates accelerating 
conditions, under which S must emit a re- 
quired response rate in order to obtain the 


stimulus at maximum volume. The bottom 
letter indicates decelerating conditions, 
whereby S need not respond to acquire that 
stimulus. In Segments 3 and 4 (control con- 
ditions) S was granted either White Falcon 
or Incredible Journey, regardless of respond- 
ing, not responding, or rate of response. He 
was unable to avoid the stimulus during 
these conditions. 

Each listener’s operant response rate was 
recorded on a cumulative response recorder. 
This equipment is comprised of a roll of 
paper moving continuously at 30 centimeters 
(about 12 inches) per hour and a recording 
pen that records each handswitch response 
by an upward movement. After 500 of these 
recorded responses, the pen automatically 
resets and the moving process is repeated. 
The slope of the line on these cumulative 
records indicates the rate of pressing the 
handswitch, a steep slope indicating rapid 
responding and a smooth slope indicating 
continuous responding. These cumulative 
records enable the experimenter to measure 
directly the listener’s moment-to-moment 
interest, or his desire to listen to either story. 
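
Reading a response rate from such a record amounts to dividing the number of responses by the time represented by the paper travelled. The sketch below uses the paper speed and pen-reset count given above; the function names and the sample measurement are hypothetical.

```python
PAPER_SPEED_CM_PER_HOUR = 30.0  # from the text
RESPONSES_PER_RESET = 500       # pen resets after this many responses


def total_responses(resets, responses_since_reset):
    """Responses represented by a record with the given number of pen resets."""
    return resets * RESPONSES_PER_RESET + responses_since_reset


def response_rate(responses, paper_cm):
    """Mean responses per minute over a stretch of paper_cm centimeters."""
    if paper_cm <= 0:
        raise ValueError("paper_cm must be positive")
    minutes = paper_cm / PAPER_SPEED_CM_PER_HOUR * 60.0
    return responses / minutes


# Hypothetical reading: 390 responses over 1.5 cm of paper (3 minutes).
rate = response_rate(390, 1.5)
print(rate)
```

At this paper speed 1 centimeter corresponds to 2 minutes, so a rate of 45 responses per minute rises by 90 responses for every centimeter of paper.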

At the conclusion of the session each S 
was asked to verbalize his preference. This 
stated preference was then compared to his 
operant choice. 


RESULTS 


The upper segment of Figure 1 is a 
cumulative record of an S who evi- 
denced an operant preference for J 
over W. This S, referred to as S 1, was 
the only child in this experiment who 
expressed a preference for, and oper- 
ated to attain, J. 

Her operant preference for J was 
unequivocal in that she responded to 
acquire this narrative form at a high 
and consistent rate during Segments 1 
and 5 when J was programmed as an 
accelerating consequence. She further 
evidenced her J option by not respond- 
ing when J was programmed as a de- 
celerating consequence during Seg- 
ments 2 and 6. Final confirmation of a
J choice was demonstrated during Seg- 
ment 4, when she accepted J without 
responding, and during Segment 3, 
when she performed in a searching or 
probing manner in a vain attempt to 
Fig. 1. The upper segment demonstrates
S 1’s operant preference for J. (The center
graph depicts S 2’s operant preference for
W. The lower segment demonstrates S 5's
slight operant preference for W.)


be granted the more favored, but un- 
obtainable, J. 


S 2's operant preference for W over 
J was representative of the perform- 
ance of four other Ss. The center por- 
tion of Figure 1 is a cumulative record 
of his response behavior. 

Although S 2's response rate to ac- 
quire his favored stimulus was lower 
than that of S 1 to acquire her favored 
narration, it can be noted by compar- 
ing their records that their response 
rate patterns are complete opposites. 
During Segments 1 and 5, when W 
was programmed as a decelerating 
consequence, S 2 refrained from re- 
sponding, accepting W without effort. 
However, when W was programmed as 
an accelerating consequence, during 
Segments 2 and 6, he met the contin- 
gency requirement by responding be- 
yond the minimal setting of 45 per 
minute. During Segment 3, a control 
segment, where W was programmed as 
both a decelerating and accelerating 
consequence, S 2 accepted the “free” 
offering and did not respond. Final 
confirmation of a W choice was indi- 
cated when he responded in a search- 
ing, yet futile, manner during Segment 
4 to obtain his favored W. During Seg- 
ment 4, S 2’s irregular and intermittent 
response pattern was similar to that of 
several other Ss in this research dur- 
ing the nonpreferred control segment, 
since at this time the favored narra- 
tion was unobtainable. 

The lower portion of Figure 1 is the 
cumulative record of S 5, the only 
S who did not perform in such a way 
that a definitive operant selection 
could be ascertained. Although S 5 
verbally expressed a preference for W 
over J, this fact was not totally sub- 
stantiated by his ongoing operant 
choice. Observation of his cumulative 
record reveals that S 5 did some re- 
sponding during all experimental seg- 
ments, including the control portions 
(Segments 3 and 4), where responding 
of any type was completely unproductive.
If S 5's response rates during the
final two segments (5 and 6) are com- 
pared, some slight preference for W 
is noted in that he worked at 78 per 
minute to attain J and 130 to ac- 
quire W. However, if the response 
rates during Segments 1 and 2 are 
compared, an alternate interpretation 
may be made, in that he responded 
more for J during Segment 1 than for 
W during Segment 2. 

As a group the data revealed that 
five Ss operated to acquire W, one to 
obtain J, and one to be granted por- 
tions of each. 

When verbal preferences were com- 
pared to operant choices, five con- 
sumers had synonymous reactions; 
whereas in two instances the operant 
and verbal choices were not in agree- 
ment. S 5 had expressed a preference 
for W and operated to acquire portions 
of J as well as W; while S 6 responded 
consistently to obtain W, but verbal- 
ized no particular preference for either 
story. 

In view of the fact that in this study 
all but two of the Ss’ verbal and oper- 
ant preferences coincided, obtaining 
verbal options alone would have been 
a reasonably valid indicator of pref- 
erence. One might further speculate 
that for this sample of children their 
ongoing, or operant, preference was 
virtually the same as their verbalized 
choice made after the narration had 


been consumed. However, this stated 
or verbalized choice could be useful 
only if an either-or dimension of data 
is sufficient. On the other hand, the 
continuous operant preference, since 
it has a numerical equivalent, enables 
the experimenter to make quantifiable 
statements pertinent to the relative 
interest of one story as compared to 
another. 

Since the effectiveness of the conju- 
gate methodology, insofar as continu- 
ously measuring individual aesthetic 
preference is concerned, has been dem- 
onstrated, implications for further ap- 
plication of this technique seem rather 
obvious. Not only might this tactic 
prove useful in an educational setting, 
but future research may validate its 
effectiveness in broader areas of com- 
munication and experimental esthetics. 
SOCIAL SCHEMA OF NORMAL AND DISTURBED
SCHOOL CHILDREN¹


RHODA LEE FISHER²
Syracuse Scholastic Rehabilitation Program, Syracuse, New York 


The Kuethe social schema technique significantly differentiated be- 
tween emotionally disturbed and normal boys of elementary school 
age. The difference was such that the disturbed boys placed greater 
distance between figures in the social schemas. The degree of distance 
the disturbed boys placed between the figures was positively correlated
with the amount of hostility shown by their mothers. Mothers' hostility
was measured with the Buss-Durkee Hostility Scale. It would
appear that children with hostile mothers conceive of social relations 
as more distant than children whose mothers are less hostile. 


The present studies concerned them- 
selves with a group of children whose 
deportment was so disruptive that 
teachers could not exercise sufficient 
control over them to permit them to 
remain within the normal classroom. 
The inability to behave in an ac- 
ceptable socially approved fashion 
seems to be a critical problem for 
them. A measure of social distance 
has been developed by Kuethe (1962) 
which schematically represents the 
individual’s concept of social inter- 
action. With this technique he found 
that 


individuals who employ a specific social 
schema in organizing behavior in one situa- 
tion will employ the same schema in quite 
different situations where the one common 
denominator is social stimuli of the same 
content even though the physical form of 
the stimuli may be completely changed 
[Kuethe, 1964, p. 24]. 


¹ This study was partially supported by
National Institutes of Health Grant MH-
01475.

² Acknowledgement and thanks are expressed
to the school principal, Frank Liss,
and to teachers W. Liston, T. Clift, and E.
Feeley, all of the Syracuse public schools,
whose cooperation facilitated the collection
of this data.


Within the context of the application
of the social schema technique, the
following objectives were set:

1. To compare the social schema of 
normal children with those character- 
izing children with school behavior 
problems. It was anticipated that 
those showing disruptive behavior in 
the classroom would portray interac- 
tion schema differently than would 
normally behaving children. More 
specifically it was expected that the 
disturbed children would experience 
themselves as having fewer meaning- 
ful ties to others—as more alienated 
and without meaningful roles. That is, 
they feel separated and distant from 
others. 

2. To relate distances set by chil- 
dren between human forms and the 
degree of hostility characteristics of 
their mothers. It was hypothesized 
that the more angry and aggressive 
the mother the more distantly would 
her child structure social relations. 
That is, one would expect that the 
angry mother would be perceived as 
repelling and pushing others away. 
She would exemplify a style of social 
relationship based on distance and 
unfriendliness. 

3. To compare a group method of 
administration of social distance 
schema with an individual method of 
administration. 
STUDY I


Method 


Subjects. One group of subjects (Ss) 
consisted of 32 white male elementary 
school children in special classes designed 
for those unable to adjust to the regular 
school demands. These children were too 
difficult for the regular classroom teacher 
to control. The mean age of the group was 
9.5 (range 6-13). A second group of Ss was
comprised of 45 white male elementary 
school children from the regular fifth- and 
sixth-grade classrooms; and a third group 
included 44 white female elementary school 
children from the regular classroom. The 
mean age for both normal groups was 11.2 
(range 10.0-13.0). 


Procedure 


An adaptation of a technique developed 
by Kuethe (1962) for the measurement of 
social schemata was used. Instead of 
Kuethe’s usual felt figures with directions 
to place them on a large felt background 
mounted on a wall, the instructions were to 
glue the figures on an open field (in a 
booklet of paper). This shift in procedure 
enabled the series to be administered on a 
group basis. Specifically, each sample of Ss 
placed eight groups of figures on a series of 
fields according to Kuethe’s (1962) “free re- 
sponse instructions.” The figures for the 
group series consisted of yellow cutouts on 
gum-backed paper varying in height from 
2½ inches (the baby figure) to 5½ inches
(the man figure). Each S was provided 
with a booklet consisting of eight pages of 
inch by 22-inch paper. An envelope con- 
taining figures to be placed on each page 
was distributed to each S. Not until every- 
one had completed his arrangement of the 
figures on a given page were instructions 
given to turn to the next page; and the 
next envelope was distributed. A description 
of the figures contained in each envelope 
follows: 

Envelope 1: two boys. 

Envelope 2: a girl, a woman.

Envelope 3: a woman, a man. 

Envelope 4: two boys, one girl. 

Envelope 5: a man, a girl. 

Envelope 6: a man, a boy. 

Envelope 7: a boy, a girl. 

Envelope 8: a boy, a girl, a baby. 

Scoring consisted of measuring the dis- 
tance between each two figures. The score 
derived was the average distance for the 
entire eight situations. 
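
The scoring rule can be sketched as below. Several details are assumptions, because the text does not specify them: each figure is reduced to a single (x, y) point in inches, distances are straight-line, and the final score is the mean of the per-page mean distances. The function names and sample placements are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean


def page_score(figure_points):
    """Mean straight-line distance between every two figures on one page."""
    return mean(dist(p, q) for p, q in combinations(figure_points, 2))


def schema_score(pages):
    """Average of the per-page scores across all situations."""
    return mean(page_score(points) for points in pages)


# Two hypothetical pages: one with two figures, one with three.
pages = [
    [(0.0, 0.0), (3.0, 4.0)],
    [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)],
]
print(schema_score(pages))
```

A full protocol would supply eight pages per S, one for each envelope listed above.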


Results and Discussion 


The average distance between the 
figures arranged in schemas was sig- 
nificantly different for the disturbed 
and the normal samples. Normal boys 
arranged the human figures more 
closely together than did the disturbed 
boys. Mean total average distance for 
the normal boys was 1.6 inches; mean 
total average distance for the group 
of disturbed boys was 2.3 inches (t =
2.9, p < .01).

A comparison between the normal 
boys and the normal girls failed to 
reveal a real difference. The total 
average distance of the figures as 
grouped by the girls was 1.6 inches. 

Since no sex differences were appar- 
ent between the normal boys and the 
normal girls it seemed logical also to 
compare the normal girls with the dis- 
turbed boys. When this was done, 
a significant difference was once again 
found (t = 2.3, p < .05). 

Another analysis in which age was 
controlled indicated that age exerted 
no influence on the distance between 
figures as represented in the social 
schemata. Children below the age of 
10.0 were eliminated from the dis- 
turbed group. Children above the age 
of 11.0 were eliminated from the 
normal group of boys. The mean age 
of both groups for this analysis was 
10.4 years. Mean total distance be- 
tween figure placement for the normal 
boys (N = 12) was 1.4 inches. The 
closeness of these values to those for 
the total samples indicated that age 
did not influence the perception of 
social distance in the present experi- 
mental context. 

As predicted, normal children 
placed figures with human character- 
istics closer together than did chil- 
dren with serious school problems. In 
terms of Kuethe’s work this suggests 
that the normal children feel closer 
TABLE 1
MEANS AND STANDARD DEVIATIONS OF THE
NORMAL AND DISTURBED CHILDREN ON
THE GROUP ADMINISTERED FORM OF
THE SOCIAL SCHEMATA TECHNIQUE


Subjects 


Disturbed boys 
Normal boys 
Normal girls 


and more related to others than do 
the disturbed children.

In a paper recently published, Weinstein
(1965) also reports that the
Kuethe schema placements discrimi- 
nate disturbed and normal boys. The 
figures and felt size used by Weinstein 
were not directly comparable to those 
employed in the present study. In her 
study a group of 20 disturbed boys 
admitted to a special residential school 
for emotionally disturbed children was 
evaluated. A group of 20 normal con- 
trol children from a nearby public 
school matched in age and IQ was 
used as a comparison group. Results 
indicated that disturbed boys placed 
figures that represented mother-child 
relationships at a relatively greater 
distance from each other than did 
normal boys. A second finding indi- 
cated that disturbed boys, as com- 
pared to normal boys, visualized hu- 
man relationships as involving more 
intervening distance than is true of 
relationships between abstract geo- 
metric forms. Overall the Weinstein 
data and those from the present proj- 
ect, while not specifically supportive 
of each other, certainly indicate in 
common a trend for disturbed boys to 
conceptualize human relationships as 
distant. 


STUDY II
Method 


For this study only those boys in the 
special disturbed classes were used. It was 
the purpose of this project to investigate 
relationships which might exist between the 
hostility attributes characterizing the 
mother and the closeness-distance positions 
assigned to human figures on the Kuethe 
schemas by the children. Since aggression is 
such a prominent dimension in the behavior 
of the children it seems an important 
dimension to investigate in the mothers as
well. To this end, five subscales of the Buss-
Durkee Hostility Scale (Buss, 1961, pp. 171- 
172) were administered to each mother. 
These five scales measured Assaultiveness, 
Irritability, Negativism, Verbal aggressive- 
ness, and Guilt. The S was asked to respond 
with a true or false answer to 48 different 
statements regarding the expression of 
anger. Examples of the Assaultive aggressive 
subscale are “Once in a while I cannot 
control my urge to harm others,” “People 
who continually pester you are asking for a 
punch in the nose,” “I have known people 
who pushed me so far that we came to 
blows.” Examples of the Irritability scale 
are “If someone doesn’t treat me right, I 
don’t let it annoy me,” “Sometimes people 
bother me just by being around,” “I can't 
help being a little rude to people I don't 
like.” The Negativism scale includes statements
such as “Unless somebody asks me in
a nice way, I don’t do what they want.”
“When I disapprove of my friends’ behavior,
I let them know it” and “I can't help
getting into arguments when people disagree
with me” are examples of the Verbal aggression
subscale. The Guilt subscale includes
statements such as “I sometimes have
bad thoughts which make me feel ashamed 
of myself,” “I am concerned about being 
forgiven for my sins.” 

An individual form of the social schemata 
was used in this experiment with the dis- 
turbed group of boys. The individually ad- 
ministered form consisted of seven groups 
of figures cut from yellow felt. The figures
were derived from Kuethe’s (1962, pp. 32- 
33) original patterns and Ss were instructed 
to arrange each group of figures on a black 
felt field 36 inches by 38 inches that was 
stretched on a wall of the experimental 
room. This series consisted of the following 
groups: 

1. Two children and a square.

2. Two children and a woman. 

3. Two children and a man. 


4. A man, a woman, a square, and a large
rectangle. 

5. A man, a woman, a child. 

6. Six children. 

7. A child, a book, a circle.


Results and Discussion 


Four scores were obtained from the 
individually administered technique: 
the total average distance between 
all of the figures, the average distance 
between all of the adult figures, the 
average distance between all of the 
child figures, the average distance be- 
tween adult and child figures. 

Two measures from the Buss- 
Durkee Hostility scale were positively 
related to the distance placement be- 
tween figures in the social schemata. 
The Buss-Durkee Assaultiveness
scores of the mothers were positively
correlated with the children’s indicated
distances between adult Kuethe
figures (r = .57, p < .01, N = 25).
The mothers’ Irritability scores were
positively correlated with the distances
between adult Kuethe figures
(r = .53, p < .01, N = 25). The total
average social schemata distances 


TABLE 2 
CORRELATION OF MOTHER'S BUSS-DURKEE
HOSTILITY SCALE SCORES WITH VARIOUS
MEASURES OF KUETHE’S SOCIAL
SCHEMATA
Individually tested | G30 
Buss-Durkee | Average|Average| Total | Total 
adult child | average| average 
distance|distance| di distance 
M = = | (M = (M = 
29 | 31) | $0 23) 
Direct, aggres- 
sion (M = 
3.6) 57*** 
Irritability 
(M = 5.2) |.53***| .37* |.55*** 
Negativism 
(M = 1.8) — .68*** 
*p = .05.
***p = .01.


TABLE 3 
CORRELATIONS BETWEEN THE GROUP ADMINISTERED
AND INDIVIDUALLY ADMINISTERED
FORMS OF THE SOCIAL
SCHEMATA TECHNIQUE


                                      Group administered form
Individually administered form     Average adult    Average distance
                                   distance         between child figures
                                   (M = 2.3)        (M = 2.3)

Average distance between
  adult figures (M = 2.9)          .43**            .37*
Average distance between
  adult and child figures
  (M = 2.9)                        .09              .08
Average distance between
  child figures (M = 3.1)          .05              .03
Total average distance
  (M = 3.1)                        .18              .12

*p = .05.
**p = .02.


were also positively correlated with 
the mothers’ Irritability score (r =
.58, p < .01, N = 26). Mothers’ Irritability
scores were also positively
correlated with the distances between 
the child figures (r = .37, p = .05, N 
= 26). 

Generally then, children who place 
human figures at a relatively large 
distance from each other have moth- 
ers who are depicted as angry and 
hostile. 
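
The significance levels attached to these correlations can be checked with the usual conversion of a Pearson r to a t statistic on N - 2 degrees of freedom. The source does not say which test was used, so the sketch below assumes this standard conversion; the function name is illustrative.

```python
from math import sqrt


def t_from_r(r, n):
    """Convert a Pearson correlation r, based on n pairs, to a t statistic.

    The result is referred to a t distribution with n - 2 degrees of freedom.
    """
    if n < 3:
        raise ValueError("at least three pairs are needed")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


# Assaultiveness versus adult-figure distance, as reported in the text.
t_assault = t_from_r(0.57, 25)
```

For r = .57 and N = 25 the statistic is about 3.33 on 23 degrees of freedom, which exceeds the two-tailed .01 critical value of roughly 2.81 and so agrees with the reported p < .01.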


STUDY III


Method 


A comparison of the two forms (indi- 
vidual and group) of administration of the 
social schemata technique was the purpose 
of the third study. The children in the 
special classes for the disturbed were ad- 
ministered both versions of the Kuethe 
procedure. Though the situations for each
of the series are not completely parallel, 
they seemed sufficiently similar to make 
such an analysis worthwhile. 


92 RHODA LEE FISHER


Results and Discussion 


Distance between adult figures for 
the individually-administered sche- 
mata was significantly related to the 
equivalent distance between figures 
for the group-administered schemata 
(r = .43, p = .02, N = 30). 

Distance between adult figures for 
the individually-administered sche- 
mata was significantly related to dis- 
tance between child figures for the 
group-administered schemata (r = .37,
p = .05, N = 30). 

There is evidence indicating that a 
similar dimension is tapped by both 
versions of the Kuethe technique. 


CONCLUSIONS


Children with problems in adjust- 
ment to the classroom framework 
place human figures at a significantly 
greater distance apart than do chil- 
dren who are able to adjust success- 
fully to the classroom. In terms of 
Kuethe’s findings (1962), this indi- 
cates that the disturbed children feel 
relatively distant or estranged from 
others. Whether this sense of distance 
is produced by the negative conse- 


quences of their behavior or actually 
plays a previous role in contributing 
to their poor adjustment remains to be 
determined. There is the possibility 
that the degree of distance felt by a 
child between himself and others 
arises from socialization experiences.
This is suggested by the fact that a 
significant positive correlation was 
found between Kuethe scores and 
measures of mother hostility in the 
disturbed group. Structuring one’s re- 
lations in a “distant” fashion may be 
a consequence of dealing with a 
mother who angrily pushes him away 
or perhaps of needing to remain far 
enough away from mother to avoid 
her aggression. 
Interview and card sort data were obtained on 261 high-, average-,
and low-achieving bright 8th-grade boys in an attempt to confirm or 
refute earlier findings concerning identification patterns, motivation, 
and values. Confirmation was generally found, the main exception 
being that in the present study most Ss identified with fathers whereas 
in the previous study high achievers were most likely to do so. Socio- 
economic status bias may have influenced earlier data. Low achievers 
were found more motivated than others to affiliate with peers; high 
achievers were more motivated academically. Low achievers were 
more nonconforming, whereas high achievers were more independent.
School was seen by most Ss as demanding conformity, and Ss accepted 
this role model. Scholarship was shown to have little relationship to 
peer popularity, and the perceived norm for school achievement was
that of mediocrity.


Identification may be defined as the 
process of affiliation with one or more 
other persons, groups, or institutions, 
which tend to become models. Attitudes,
values, and other behavior are
imitated, and may be internalized by 
the imitator. 

Identification is closely related to 
motivation, since one tends to identify 
with those who provide him suitable 
reinforcers, and, on the other hand, 
reinforcers become so partly because 
they are dispensed by those with whom 
one identifies. 

An extended discussion of identification
and imitation is provided by
Bandura and Walters (1963). Ger- 
mane to the present study is the gen- 
erally accepted belief that, in early 
childhood, both boys and girls tend 
to identify most closely with the 
mother. After early childhood, how- 
ever, the small boy tends to identify 
with the father and to accept him as
a role model; this transition usually
takes place by the time the boy
reaches school age.

¹The research reported herein was supported
by the Cooperative Research Program
of the United States Office of Education,
United States Department of Health,
Education, and Welfare (Project 8-035,
Final Report 1965).

In developing his system of values, 
attitudes, and motives, the child is 
exposed to other adult models (such 
as teachers), and to models in the peer 
group. He rarely reproduces all of the 
attributes of any given model, but se- 
lects certain ones from each. Accord- 
ingly, if identifying figures have many 
common attributes, there will be a 
corresponding reinforcing effect and 
the child's behavior should strongly 
reflect the consensual elements. If, 
however, identifying figures portray 
diverse behaviors, the child’s own at- 
tributes may vary with the influence 
of the identifying figures. 

In relation to motives to achieve in 
school, a number of possibilities exist. 
The boy may or may not identify 
closely with his father, or with both 
parents. The parents may or may not 
value school achievement and rein- 
force achievement striving. The boy 
may or may not identify with his 
teachers. The peer group with which 
he identifies may or may not value 
school achievement. Depending, therefore,
on the identification patterns of
the boy, the values of his models, those 
he has internalized, and his perception 
of school, motivation to achieve will 
vary. It is expected that these varia- 
bles will be related to differences in 
academic achievement of children 
whose abilities and cultural back- 
grounds are such as to permit above- 
average school achievement. 

In the present study no formal hy- 
potheses were proposed, but questions 
were raised concerning the extent of 
the subjects’ (Ss’) identification with 
fathers, teachers, and peers, and with 
acceptance of school values, as these 
variables related to school achieve- 
ment. 

The research reported here is part 
of a larger study (Ringness, 1965c). 
It partially replicates an earlier study 
(Ringness, 1963, 1965a, 1965b) and 
is an attempt to confirm or refute pre- 
vious findings. Certain differences in 
sampling and instrumentation were in- 
troduced so that the two studies are 
not entirely comparable. 

The earlier research employed 
matched pairs of ninth-grade boys, 
half of whom attained a grade-point- 
average (GPA) of 3.00 or above 
(based on a system where A = 4.00) 
and half of whom attained a GPA 
of 2.00 or below. The remarks of 
Thorndike (1963), among others, sug- 
gest that failure to include the middle 
range of achievement might neglect 
possible curvilinear relationships 
among data; the present study attempts
to compensate for that omission.

Further, the earlier research em- 
ployed a sample of Ss from an upper- 
middle socioeconomic status (SES) 
population. Since parents and youth 
from this SES may be more oriented 
toward school achievement than those 
from other SES classes, there may 
have been a bias in the earlier data; 
the present study embodies Ss from all
SES classes. 

Girls were not studied in either re- 
search, primarily because boys show 
poor school achievement approxi- 
mately four times as frequently as 
girls, hence seem to constitute a more 
pressing problem. In addition, various 
studies (e.g., McGuire, Hindsman,
King, & Jennings, 1961) have shown
sex-specific factors related to school
achievement. 


RELATED LITERATURE 


This study does not assume generalized
motives to achieve (e.g., McClelland's,
1953, achievement motive).
Rather, it is considered that behavior 
related to school achievement is 
related to values possessed and rein- 
forcements offered by identifying fig- 
ures. These may vary among them- 
selves, with resultant influences on 
motives to achieve in school. 

A number of studies have been con- 
cerned with the relationships of aca- 
demic achievement to parent-child re- 
lationships. Taylor (1964) surveyed 
studies of personality traits related
to discrepant school achievement. Employing
the terms “overachiever” and
“underachiever,” he found general
support for the belief that the overachiever
accepts authority and has a
good relationship with his parents. The 
parents tend to be supportive of their 
children’s academic efforts and the 
children try to please them by doing 
well academically. Morrow and Wil- 
son (1961) and the Portland Public 
Schools (1959) found that families of 
high achievers were more supportive, 
less authoritarian, and more permis- 
sive than families of low achievers; 
presumably such relationships encour- 
age children's independence and striv- 
ing behaviors. 

However, the dynamics are not en- 
tirely clear, for Drews and Teahan 
(1957) found that domineering mothers
tend to foster high achievement in
their children, and Taylor notes that 
some studies suggest that overachiev- 
ers may be compensating for a lack 
of love, warmth, or understanding at 
home. In the latter event, identifica- 
tion with the home might be lacking, 
but identification with the school 
might be present. 

In contrast to high or overachievers, 
low or underachievers tend to react 
against the home and school, and pro- 
vide more problem behavior for au- 
thorities (Frankel, 1960; Gowan, 
1955; Ringness, 1963). Taylor found 
underachievers to be usually regarded 
as hostile and aggressive toward authorities,
and conflict with parents
seemed to be carried over to authority 
figures outside the home. Bandura and 
Walters’ (1959) work on adolescent 
aggression tends to confirm this be- 
lief, and supports the notion that for 
boys, the father-son relationship may 
be the most important determiner. 
The weight of evidence suggests that 
identification with the home, espe- 
cially with the father, may be closely 
related to the school achievement of 
boys. 

Motive to affiliate with peers has 
been studied extensively. Ringness 
(1963) found that low achievers were 
more motivated to affiliate with peers 
than were high achievers, in contrast
to the family orientation and engage- 
ment in family activities characteristic 
of high achievers. Taylor’s review 
found that overachievers were more 
independent, leaderlike, responsible, 
and dependable than underachievers; 
the latter tended to identify with a 
peer group, from which they gained 
support. Peer group affiliation was 
seen as a means of gaining some emo- 
tional security and as a way of re- 
ducing anxiety for underachievers. 

McGuire et al. (1961) tend to con- 


firm the notion that low achievers are 
more dependent on peer group sup- 
port, but there is an implication that 
low achievers are not well accepted by 
the peer group at large. In a factor 
analytic and multiple regression study 
of predictors of talent, two independ- 
ent peer-related factors were isolated. 
“Peer Stimulus Value” (PSV) was de- 
fined as a positive response to pres- 
sures imposed by age-mates. Children 
high in this factor were regarded as 
models by their peers; they were ac- 
tive, accepted, self-confident, and ef- 
fective. A person high in the second 
peer-related factor, “Age-Mate Avoid- 
ance,” (AMA) was regarded by age- 
mates as one who dislikes school, gets 
by, has to be told what to do, does 
what he feels, yet depends on peers 
for approval. Impulsivity and avoid- 
ance by peers were characteristics of 
Ss who were deviant and lacked 
“model value.” 

A factor of “Socially Oriented 
Achievement Motivation” (SOAM) 
was also found. This factor was de- 
fined by acceptance of school and 
cultural standards and by scholastic 
motivation. Stability, restraint, rela- 
tively little criticism of education, and 
a tendency to be sociable were char- 
acteristic of Ss high in this factor. 

Finally, there is widespread support 
in the literature for the idea that high 
achievers are more motivated to 
achieve in school than are low achiev- 
ers (Taylor, 1964). 

In summary, high achievers tend 
to identify more with parents, author- 
ity figures, and with cultural norms, 
and they employ socially-accepted be- 
havior. They are motivated to achieve 
highly in school and gain satisfaction 
from their school activities. They are 
independent and leaderlike, and enjoy 
peer relations without being depend- 
ent on peer group support. Low achiev- 
ers, in contrast, tend to reject parental 
and school values, resent authority,
employ less socially-accepted behav- 
ior, are seen as deviant from the peer 
group, yet tend to lean on the peer 
group for support. 


PROCEDURE
The Sample 


Two Midwestern cities cooperated in this 
study. The larger city (population 175,000) 
includes a state university, is the seat of 
state government, is a tourist and trade
center, and is the home of both heavy and 
light industry; all SES classes are present, 
although the majority of families would 
be considered of upper-middle or lower- 
middle SES. The smaller city (population 
40,000) is primarily industrial in character; 
the population tends to lower-middle SES. 

An initial sample of 310 eighth-grade 
junior high school boys was randomly drawn 
from 18 junior high schools. The Ss met the 
criterion of California Test of Mental Ma- 
turity IQ 120 or above, normal eighth-grade 
age, normal and comparable class loads, and 
absence of incapacitating physical defects 
or gross emotional disturbance. The sample 
was tested with the WISC, and Ss of IQ 
116 or above were retained. Attrition re- 
duced the final sample to 261 Ss. 

The final sample was divided into three 
equal groups of high, average, and low 
achievers, on the basis of an equal-weight, 
composite ranking of GPAs in the 7th and 
8th grades and total score on the Iowa Test 
of Basic Skills. Complete data were obtained
for 88 high, 85 average, and 88 low achievers. 


Instrumentation 


Germane to this part of the study are the 
student interview and card sort. A structured 
interview with allowance for elaboration by 
Ss was administered individually. Based on 
Ringness’ (1963) previous study, the inter- 
view consisted of 55 questions tapping areas 
of occupational ambitions, identification 
with father, identification with teacher, ac- 
ceptance of school values, peer relations, 
attitude toward school marks received, out- 
of-school activities, and attitude toward 
heterosexual relationships. 

A rectangular card sort assessed dimen- 
sions of pupil self-report concerning inde- 
pendent behavior, nonconforming behavior, 
motive to affiliate with peers, and motive to 
achieve academically. Based on the earlier 
study, additional items were added to differentiate
more clearly between nonconforming
and independent behavior, validity
of new items being judged by trainees in 
School Psychology. Eighty items were pro- 
vided, 20 relating to each dimension. Eight 
cards were sorted into each of 10 compart- 
ments, and represent scores from 1 to 10; 
low scores represented “most like me” and 
high scores represented “least like me.” Coefficients
of internal consistency were .51 for
the dimension of independence, .71 for motive
to affiliate with peers, .85 for motive
to achieve academically, and .93 for nonconforming
behavior, as stepped up by the
Spearman-Brown formula.
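
For readers unfamiliar with the step-up, the Spearman-Brown formula estimates the reliability of a lengthened test from that of a shorter one. The sketch below assumes the common split-half case (length factor k = 2); the source does not report the half-test correlations or how the items were split, so the function names and the example value are illustrative only.

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability when a test with reliability r is lengthened k-fold."""
    return k * r / (1.0 + (k - 1.0) * r)


def half_test_r(r_full):
    """Invert the k = 2 case: the half-test correlation implied by r_full."""
    return r_full / (2.0 - r_full)


# A stepped-up coefficient of .85 would correspond to a correlation of
# roughly .74 between the two halves (an inference, not a reported value).
implied_half = half_test_r(0.85)
```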


RESULTS 


The Interview 


Table 1 presents only the relevant 
interview data, together with chi- 
square tests of significance. It should 
be noted that because responses were 
sometimes elaborated, a given S's re- 
sponses might be included in two or 
more categories so that total responses 
to an item did not always equal 261. 
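
As an illustration of the kind of chi-square comparison reported in Table 1, the sketch below tests high against low achievers on one response, "For occupational preparation only" (58 of 88 versus 82 of 88). It assumes a 2 x 2 layout without continuity correction and treats each S as giving or not giving the response; the article does not state how its tests were set up, and multiple responses per S may make the actual tables differ.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / denominator


# Rows: high achievers, low achievers.
# Columns: gave the response, did not give the response.
chi2 = chi_square_2x2(58, 88 - 58, 82, 88 - 82)
```

Under these assumptions the statistic is about 20.1, well beyond the 1-df critical value of 6.63 at the .01 level.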

For many of the data the total re- 
sponses may be more interesting than 
the differences between groups. For 
example, Ss in this study enjoy a high 
frequency of activities with their fa- 
thers, and strongly admire them or ad- 
mire them with some reservation. Ss 
do not tend to admit to poor father- 
son relationships and there are no sig- 
nificant differences among achieve- 
ment groups in this regard. 

Although significantly more high 
than average or low achievers state 
that school is valuable for helping 
them develop their talents, and al- 
though significantly more average and 
low than high achievers stress the 
value of school for vocational prepara- 
tion, the majority of Ss view school 
as primarily useful for vocational 
preparation. Since "vocational prep- 
aration” at the eighth-grade level is 
primarily that of gaining general edu- 
cational tools, it would seem that 
(with the exception of a few high 
TABLE 1

INTERVIEW RESPONSES OF BRIGHT EIGHTH-GRADE BOYS
(N = 88 high, 85 average, and 88 low achievers)


Number of responses 
Interview area 
HA AA LA 
Identification with father 
Frequency of father-son activities:
Week-ends or oftener 66 66 71
2 or 3 times per month 10 9 8 
Once a month 7 3 19 
Never 1 2 3 
Don't know 4 6 5 
Attitude toward father: 
Strong admiration 47 42 49 
Admire with reservations 29 27 24 
Not close 5 9 7 
Desire to be different 1 0 1 
Strong negative 0 2 1 
Don’t know 6 6 6 
Identification with school 
Values of school : 
Not important 5 3 2 
For occupational preparation only 58 754ta 82**a 
To develop one’s talents 17 2» Tt 
Only for interest in subjects 8 8 3
Don't know 1 1 9 
Values of school for future occupation: 
Very important 51 52 58 
Somewhat important 18 14 11 
Little or no importance 12 11 10 
Don’t know 6 9 14 
Identification with teacher
Perceived teacher norms for model pupil role:
Conformity a 79 66%" (Hx 
Social competence 2 14**a 9 
Academic competence 22 M a. 
Intellectual liveliness 1 12404 9 
Don't know 1 1 3 
Attitude toward perceived norms: 8 7 75 
Approve 
Indifferent 3 9 H 
Dislike 2 2 1 
Don't know 7 2 
Characterization of self like the perceived model: gy 
E 81 54**a 61 
Much like model g**a ore 
A little like model 0 wa gen 
Try to be like model 0 i 11 
Unlike model 6 1 7 
Don’t know A 1
Perceived teacher opinion of S:
Good, S cares 55 pa s S 
So-so, S cares 19 8 11 
Poor, S cares 6 1 2
Good, S doesn't care 0 2 2 
So-so, S doesn't care 2 2 3 
Poor, S doesn't care ; n 7 


Don’t know 
TABLE 1—Continued

                                              Number of responses
Interview area                                HA      AA      LA

Peer relationships
  Characteristics of popular peers:
    Athletic                                  14      7       8
    Scholarly                                 5       0t»     3
    Good personality                          36      21**    21*8
    Athletic, good personality                20      25      17
    Scholarly, good personality               6       9       18*^
    Athletic, scholarly                       2       5       5
    All three                                 9       18      16
    Don’t know                                4       0*s     4 t*a
  Peer attitude toward good students:
    Admire                                    49      41      43
    Indifferent                               8       12      8
    Depends on personality                    15      8       12
    Square (negative attitude)                13      22      23
    Don’t know                                3       4       3
  Characteristics of S’s most admired peer:
    Athletic                                  6       8       10
    Scholarly                                 26      12*     20
    Personality                               31      20      25
    Athletic, personality                     3       3       5
    Scholarly, personality                    15      10      12
    Athletic, scholarly                       0       0       0
    All three                                 29      22      20
    Social status                             0       1       0
    No admired peer                           2       6       4
    Don’t know                                5       5       6
  Peer attitude toward school tasks:
    Work hard as possible                     12      13      9
    Work little more than is necessary        18      21      15
    Work enough to get by                     55      47      61**b
    Don’t know                                3       5       3

Note.—All chi-square tests of significance.
a Comparison with high achievers.
b Comparison with average achievers.
*p < .05.
**p < .01.


achievers) school is more of a rou- 
tinely-accepted environment than a 
place which is viewed as important to 
self-development or which provides 
interesting content and activities. 
This, in turn, may be related to the 
perceived teacher norms for model 
pupil behavior. 

Most Ss state that teachers view 
the “model pupil” as conforming. For 
the sample as a whole, academic and 
social competence and intellectual 


liveliness fare less well. Although there 
are statistically significant differences 
in the ways high achievers differ from 
average and low achievers as to per- 
ceived norms of social competency 
and intellectual liveliness, it is note- 
worthy that there is high agreement 
concerning conformity. By implication,
achievers must be conformers, a
finding which agrees with those of 
studies cited earlier. 

Differences in achievement are not 
TABLE 2
CARD SORT DATA FOR BRIGHT EIGHTH-GRADE BOYS
(N = 88 high, 85 average, and 88 low achievers)

                               Scores            Mean differences
Dimension                      M       SD        Groups   Difference   t

Nonconformity:
  High achievers               7.30    .74       H-A      -.02         .17
  Average achievers            7.32    .82       H-L      .40          3.64*
  Low achievers                6.90    .73       A-L      .42          3.50*
Independence:
  High achievers               4.98    .64       H-A      -.08         .80
  Average achievers            5.06    .68       H-L      -.26         2.60*
  Low achievers                5.24    .69       A-L      -.18         1.80
Achievement motivation:
  High achievers               4.17    Ban       H-A      -.32         2.07*
  Average achievers            4.49    .80       H-L      -.81         6.23*
  Low achievers                4.98    .90       A-L      -.49         3.77*
Affiliation motivation:
  High achievers               5.15    .96       H-A      .46          3.07*
  Average achievers            4.69    1.04      H-L      .60          3.33*
  Low achievers                4.55    .87       A-L      .14          .93

*p < .01.


related to differences in pupil percep- 
tions of teacher expectations, but they 
are related to attainment of the model 
pupil role. Although most Ss state that 
they approve the conformity role, 
high achievers feel that they are like 
the model more than do average or 
low achievers; average and low 
achievers state significantly more 
often that they are a little like the 
model, and average achievers mention 
that they "try to be" like the model. 
There are no significant differences in 
the extent to which achievement 
groups state that they are unlike the 
model. Most Ss say that they care 
how they are perceived by the teacher. 

Taken as a whole, these data sug- 
gest that most Ss identify well with 
their fathers and with teachers. They 
perceive school as a place one attends 
for vocational purposes, and where 
conformity is demanded. At this point 
in their lives they are generally willing 
to accept the conformity role, and care 
how they are regarded by their teachers.


In regard to peer relationships it is 
patent that having a "nice personality"
is the most important factor in
popularity. Good students are ad- 
mired by about half the sample, but 
35 Ss mention that a good student is 
popular primarily if he has appropri- 
ate personality attributes. Fifty-eight 
Ss, or about one-fifth of the sample, 
describe good students as "square" 
and another 28 are indifferent to 
scholarship attributes. These findings 
are also borne out in responses to ques- 
tions concerning characteristics of 
most-admired peers. The majority of 
Ss feel that the effort norm of the peer 
group is that of doing just enough to 
“get by,” although 54 think most peers
do a little more than necessary. 

Thus although school norms are 
perceived as those of conformity and 
Ss accept this norm and attempt to 
be like the perceived model, the model 
is not viewed by the majority as that
of intellectual development so much 
as that of “good behavior.” This seems 
related to the responses which indicate 
that peer achievement norms are essentially
those of “getting by” or doing
"just a little more than necessary."
Popularity seems to result from possession
of a pleasing personality; athletic
prowess, scholarship, and other
attributes may supplement such per- 
sonal qualities, but are in themselves 
not conducive to peer acceptance. 


The Card Sort 


Table 2 presents card sort data, to- 
gether with two-tailed tests of sig- 
nificance of differences. It will be re- 
called that scores might range from 1 
to 10, with lower scores being “most 
like me." 
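
The t values in Table 2 can be approximated from the reported means and standard deviations alone. The sketch below uses the unpooled standard error of a difference between two independent means; the article does not state which formula was applied, and the table values are read from an imperfect scan, so this is a plausibility check rather than a reproduction.

```python
from math import sqrt


def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """t statistic for the difference between two independent group means,
    using the unpooled standard error sqrt(sd1^2/n1 + sd2^2/n2)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / standard_error


# Nonconformity, high versus low achievers (values from Table 2).
t_high_low = t_from_summary(7.30, 0.74, 88, 6.90, 0.73, 88)
```

This gives about 3.61 for the high-low nonconformity comparison, close to the tabled 3.64; the small gap is consistent with rounding in the published means and standard deviations.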

It is seen that high and average 
achievers are less nonconforming or 
less rebellious than low achievers. On 
the other hand, high achievers are 
more independent and autonomous 
than low achievers. 

High achievers are significantly 
more oriented toward academic 
achievement than average achievers, 
who are significantly more oriented 
toward achievement than are low 
achievers. But average and low achievers
are significantly more oriented to
affiliate with peers than are high
achievers.


Discussion 


This study was undertaken partly 
to confirm or refute findings of an 
earlier study. In most respects but not 
all, confirmation was found. Within 
the limits of changes in sampling and 
instrumentation mentioned earlier, 
some comparison of findings may be 
made. 

In both studies the father was ad- 
mired, or admired with some reserva- 
tions, by a majority of Ss; however, 
the 1963 findings showed that fathers 
of high achievers spent more time with 
their sons than did fathers of low 
achievers, suggesting closer father
identification, but this was not found 
in the present study. It cannot be con- 
cluded from the present (and larger) 
study that father identification differ- 
entiates between high, average, and 
low achieving Ss. There is one possi- 
bility, which was not attacked in 
either study, which might account for 
the differential findings. It was noted 
earlier that the prior study dealt with 
a sample from the upper-middle SES 
population whereas the present sample 
dealt with all SES groups. If, as was 
suggested, upper-middle SES fathers 
are more oriented toward high scho- 
lastic achievement than fathers of 
other SES groups, differences in the 
two studies may simply represent dif- 
ferences in parental value-orienta- 
tions. This tends to confirm McGuire’s 
notion that talent factors are possibly 
community-bound. At the moment 
this must be regarded as an area for 
future research. 

In regard to identification with the 
teacher or with school values, both 
studies showed Ss to perceive the 
teacher’s concept of the model pupil 
as one who conforms, or poses no problems
in the classroom. In the 1963
study high achievers approved this
role concept but low achievers did
not; the present data show no signifi- 
cant differences among groups. How- 
ever, both studies show that high 
achievers feel that they are more like 
the role model than are other groups; 
although most Ss desire to be viewed 
as like the role model. 

Data in both investigations showed 
that school is viewed by most students 
as valuable for vocational prepara- 
tion; however, high achievers more 
than others saw school as a place to 
develop one’s talents. “Good students” 
were accepted by about half the Ss in 
each study; the 1963 data showed that 




IDENTIFICATION PATTERNS, MOTIVATION, AND SCHOOL ACHIEVEMENT 


high achievers admired scholars more 
than did low achievers, but this differ- 
ence was not found in present data. 
Good students are characterized as 
"squares" by about one-fifth of the 
present sample; in previous data low 
achievers were more likely to make 
this statement. In both studies the 
peer norm for academic achievement 
was that of doing just enough to “get 
by." 

Both studies showed that high 
achievers are motivated to achieve 
academically more than other groups, 
whereas low achievers are more moti- 
vated to affiliate with peers. High 
achievers are presently found to be 
independent, whereas low achievers 
are somewhat nonconforming; this 
finding appears to document state- 
ments of Taylor (1964). 

The work of McGuire et al. isolated 
certain predictors of talent which were 
noncognitive. Among these were Peer 
Stimulus Value, Age-Mate Avoidance, 
and Socially Oriented Motivation. 
(They also found male-specific fac- 
tors of Anxious Emotionality and 
Antisocial Wariness, but these were 
less clearly specified.) PSV suggests 
that the talented are likely to be 
leaderlike, independent, and effective. 
This is similar to present findings in 
the card sort that high achievers are 
more independent than other Ss. PSV, 
however, suggests that possessors are 
models for peers. Interview findings of 
the two studies by Ringness attacked 
the questions of popularity and “most- 
admired peers" and it was found that 
models for these samples were chosen 
more for “personality” than any other 
reason.

The AMA factor is similar to card 
sort findings that low achievers are 
more nonconforming, yet more affiliation-oriented
than high achievers. The
SOAM factor is essentially confirmed
by both card sort and interview findings
that high achievers are motivated
to conform to school and adult values, 
they identify well with home and 
school, and are academically moti- 
vated. 

The main difference in the findings 
of the McGuire and the present study 
is in the implications in the former 
that peers tend to admire the talented, 
whereas in present findings popu- 
larity and admiration are associ- 
ated with achievement primarily by 
high achievers, when so associated at 
all. In other respects, although the 
factors combined attributes of achiev- 
ers in a somewhat different fashion 
from the present study, essentially the 
same characteristics were found; these 
are also similar to those summarized 
in Taylor’s review. 

There seems sufficient evidence to 
support the conclusion that low 
achievers identify more with the peer 
group and are governed more by the 
peer group than are high achievers. 
Since the peer norm for achievement 
is seen as that of mediocrity, and since 
low achievers state less motivation for 
academic achievement, it would follow 
that an important problem to schools 
is that of finding ways to foster high 
achievement values in the peer group 
at large. The image of the scholar as 
“square” must be erased. It is possible 
that this image is fostered partly by 
the fact that a certain percentage of 
high achievers are not well-rounded 
persons, lack desirable personality 
attributes, and are not popular. 

The image of the teacher needs re- 
vision. He is viewed as demanding 
conforming behavior, but is not seen 
as fostering intellectual development 
or liveliness. Schools are seen as places 
to prepare for future vocations, but 
are seen by relatively few as places to 
develop talents, pursue interest, or to 
improve social adjustment. 

The present study refutes the notion
that high achievers identify more
with their fathers than do low achievers.
However, at junior high school
age, parent identification apparently
does not provide reinforcements for 
academic achievement to low achiev- 
ers to the same extent it does for high 
achievers. It is possible that differ- 
ential parent values are operating, and 
it would seem desirable for school 
counselors to discuss such matters 
with fathers of low achievers when the 
attainments of the latter are of suffi- 
cient concern. 

One other comment may be justi- 
fied. It is apparent that the stereotype 
of school tasks as being necessary 
evils seems still to exist. Social rein- 
forcements such as teacher praise and 
blame, grades and marks, and other 
typical reinforcements provided by
the school are not as effective with low 
achievers as could be wished. The im- 
plication is that much more effort 
needs to be spent on finding other 
kinds of reinforcers for academic 
achievement. Efforts to employ con- 
crete reinforcers may provide some 
clues; efforts to make the curriculum 
more meaningful to boys like those in 
these samples may also bear fruit. 


EFFECTS OF STRATEGY, SEX, AND AGE ON 
CONCEPTUAL BEHAVIOR OF ELEMENTARY 
SCHOOL CHILDREN 


GLENN E. TAGATZ¹ 
Indiana State University 


The effects of strategy, sex, and age on 4 concept-attainment problems 
were examined. Strategies were a commonality and a conservative 
method of solution. The age and sex dichotomies resulted from 5th 
and 6th grades. Time-to-criterion scores were the dependent variable. 
Departure from instructional strategy was examined using chi-square. 
Statistically significant differences were found for all main effects and 
the repeated measures in the first analysis. The strategy difference fa- 
vored the commonality solution. The sex difference favored the fe- 
males. The 6th grade was less efficient than the 5th grade, indicating 
greater awareness of the complexity of the task and of the stimuli. 
The chi-square analysis indicated statistically significant preference 
among all Ss for the commonality strategy. 


Researchers such as Bruner, Goodnow, 
and Austin (1956), Klausmeier, Harris, 
and Wiersma (1964), and Miller, 
Galanter, and Pribram (1963) have 
attempted to identify strategies used by 
subjects (Ss) in the attainment of 
concepts. Following behavioristic 
tradition, these efforts have largely 
consisted of examination of instances 
selected (selection strategies) or the 
relationship between instances and 
subsequent behavior (reception 
strategies). The concept of strategy as a 
plan allows another approach which 
has, until recently, been overlooked in 
attempts to identify strategies. This 
consists of (a) “programming” Ss with 
the plan or strategy, (b) having Ss 
attain a concept, and (c) then making 
subsequent examination to determine 
whether or not the respective strategy 
has been followed and to determine 
the relative efficiency of Ss using the 
strategies studied. The objectivity of 
earlier efforts is maintained; and ad- 
ditionally, the relative efficacy of the 
strategies compared may be appraised. 


1 The author is presently a Postdoctoral 
Fellow at the Research and Development 
Center for Learning and Re-Education at 
the University of Wisconsin. 


This particular method of studying 
strategy has the additional advantage 
of eliminating some of the residual 
variability in an analysis which arises 
from the fact that Ss previously op- 
erated initially without a clear frame 
of reference as to any of the possible 
methods of attainment. Hence early 
attempts by Ss were characterized by 
inefficiency regarding knowledge of 
method in addition to inefficiency re- 
garding knowledge of materials, that 
is, levels of attributes, number of attri- 
butes relevant to the concepts in the 
study, etc. 

The influence of the sex of an S on 
concept attainment has been consid- 
ered by various researchers with vary- 
ing results: Klausmeier et al. (1964), 
Tagatz and Meinke (1966), and Olson 
(1963). In the first two studies, stu- 
dents of college age were used. In 
Olson's study adolescents were used. 
The first two failed to find a statisti- 
cally significant sex effect; the last 
did. Olson concluded that the demon- 
stration of sex differences depends 
upon the particular concept to be 
learned. It may depend also upon de- 
velopment and concomitant charac- 
teristics. 

The age of Ss, then, is another vari- 
able which must be given attention in 
conjunction with the study of concept 
attainment. Yudin and Kates (1963) 
found that 14- and 16-year-old Ss 
were significantly superior to the 12- 
year-old Ss in their study. They con- 
cluded that their results confirm the 
description of adolescent development 
advanced by Inhelder and Piaget 
(1958), who suggest that formal rea- 
soning begins at about age 12 years 
and reaches equilibrium at about 14-16 
years of age. In the present study, 
Ss were selected from fifth and sixth 
grades in an attempt to examine per- 
formance immediately prior to and at 
the age at which such formal reason- 
ing begins. If performance on tasks 
used in this study reflects formal rea- 
soning, then differences between the 
two grades should be evident. 


THE PROBLEM 


The purpose of this study, then, 
was to investigate the influence of 
certain variables upon efficiency of 
concept attainment. Specific questions 
examined were: 

1. What are the effects of the fol- 
lowing four variables upon concept 
attainment: (a) grade level—fifth or 
sixth; (b) sex; (c) instructed method 
of solution—conservative or common- 
ality strategy; and (d) repeated meas- 
ures, that is, four problems? 

2. Is there a significant preference 
among fifth- and sixth-grade students 
for either one of the two strategies? 

In this study the conservative strat- 
egy consisted of a verbally presented 
algorithm for concept attainment in 
which information processing of one- 
dimensional differences was specified. 
For the commonality strategy, a 
method of solution was articulated 
whereby the defining characteristics 
common to a set of conceptual exem- 
plars were specified as the concept, 
provided complete dimensional varia- 
tion was presented. 


METHOD 


Subjects 


The Ss for this study consisted of a 
stratified, random sample of fifth- and 
sixth-grade students from the Campus Ele- 
mentary School at Oshkosh, Wisconsin. 
From each grade level 10 males and 10 fe- 
males were included. Thus a total of 40 
students participated. 


Material 


Stimulus material for the study consisted 
of a display board containing 32 cards. Each 
card contained one of two levels of figurally 
presented information for each of five di- 
mensions. The five dimensions and their 
defining characteristics (levels) follow: (a) 
border number: one or two; (b) border 
continuity: solid or broken; (c) figure num- 
ber: one or two; (d) figure color: purple or 
yellow; (e) figure shape: square or triangle. 

On every card one level of each of the 
five dimensions was presented. Each card 
was different from every other card in one 
or more of the five possible ways. This 
stimulus construction permitted conjunctive 
categorization of the cards on the board 
into groups possessing common characteris- 
tics and those not possessing the combina- 
tion of characteristics. 
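
The stimulus construction described above can be sketched in a few lines of Python. This is only an illustration, not part of the original materials: the dictionary keys, the function names build_deck and matches, and the example concept (borrowed from the "two borders with purple figures" category mentioned in the general directions) are assumptions made for the sketch.

```python
from itertools import product

# The five dimensions and their two levels, as listed in the Material section.
DIMENSIONS = {
    "border number": ("one", "two"),
    "border continuity": ("solid", "broken"),
    "figure number": ("one", "two"),
    "figure color": ("purple", "yellow"),
    "figure shape": ("square", "triangle"),
}


def build_deck():
    """Return every card as a dict mapping dimension name -> level."""
    names = list(DIMENSIONS)
    return [dict(zip(names, levels)) for levels in product(*DIMENSIONS.values())]


def matches(card, concept):
    """True if the card shows every level required by a conjunctive concept."""
    return all(card[dim] == level for dim, level in concept.items())


deck = build_deck()
# Hypothetical two-dimension conjunctive concept used only for illustration.
concept = {"border number": "two", "figure color": "purple"}
exemplars = [card for card in deck if matches(card, concept)]
print(len(deck), len(exemplars))
```

With two levels on each of five dimensions the deck has 2^5 = 32 distinct cards, and a concept that fixes two dimensions leaves 2^3 = 8 exemplars.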


Design and Procedure 


Upon arrival at the learning laboratory 
Ss were told the general nature of the ex- 
periment and then received one of two sets 
of instructions. Each set had in common 
general directions about the conduct of the 
experiment and special instructions about a 
possible method of solution. 

In the general directions, each pupil was 
told that the experiment was concerned 
with how people learn and that he would be 
given an opportunity to work several prob- 
lems. The stimulus construction was ex- 
plained and repeated until Ss were well 
aware of the five dimensions. This degree 
of familiarity was assured by having stu- 
dents describe each of the five dimensions 
as the next step in the general directions. 
Pupils were made aware of the way that 
cards could be grouped into those cate- 
gories possessing common characteristics 
and those not possessing the characteristics. 
Thus, a concept was defined for the students. 
It was additionally explained that a con- 
cept was merely a rule for classification. 
Understanding of categorization was de- 
termined by having Ss specify which cards 
belonged to a category articulated by the 
examiner; for example, Ss were asked to 
identify which cards belonged to the con- 
ceptual category “two borders with purple 
figures.” A verbal response of “yes” was 
given to cards which illustrated the rule 
and a verbal response of “no” was given to 
cards which did not. The Ss were then in- 
formed that their job was to determine the 
rule that the examiner had in mind when 
“yes” and “no” markers were arranged on 
the cards constituting the stimulus display. 
At this particular point, the special instruc- 
tions were given to each respective group. 
The instructions for the commonality and 
the conservative strategies follow: 
Instructions for Commonality Strategy: 
You will be able to attain the concept 
successfully by giving your entire atten- 
tion to the “yes” cards. To determine 
what is the correct concept, examine the 
dimensions of the “yes” cards. Those di- 
mensions which are common to all “yes” 
cards specify the relevant features of the 
concept. As an example, consider the fol- 
lowing three cards: (Figural dimensions 
are verbally presented.) 
Card 1 (yes) 
  One Border 
  Solid Border 
  Two Figures 
  Purple Figures 
  Triangular Figures 
Card 2 (yes) 
  Two Borders 
  Solid Borders 
  Two Figures 
  Yellow Figures 
  Square Figures 
Card 3 (no) 
  One Border 
  Broken Border 
  Two Figures 
  Purple Figures 
  Square Figures 
The markings indicate that the first two 
cards belong to a group of cards which 
exemplify a concept; the third does not. 
Both “yes” cards possess the dimensions 
of two figures and solid borders. They are 
different in the other dimensions. The 
concept or underlying rule might be, 
therefore, (a) “two figures,” (b) “solid 
borders,” or (c) “two figures and solid 
borders.” By examining other “yes” cards, 
you could determine exactly which of the 
three possibilities is correct. If some other 
“yes” card contained one figure, the only 
correct concept would then be “solid bor- 
ders.” If all the “yes” cards contained two 
figures and solid borders, “two figures and 
solid borders” would be the concept. 
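
The commonality instructions amount to intersecting the "yes" cards dimension by dimension. A minimal Python sketch of that reading follows; the function name commonality, the dictionary keys, and the third card (extra_yes) are hypothetical and serve only to replay the example given in the instructions.

```python
def commonality(yes_cards):
    """Return the dimension levels shared by every "yes" card."""
    first, *rest = yes_cards
    return {
        dim: level
        for dim, level in first.items()
        if all(card[dim] == level for card in rest)
    }


# Cards 1 and 2 from the instructions (both marked "yes").
card_1 = {
    "border number": "one",
    "border continuity": "solid",
    "figure number": "two",
    "figure color": "purple",
    "figure shape": "triangle",
}
card_2 = {
    "border number": "two",
    "border continuity": "solid",
    "figure number": "two",
    "figure color": "yellow",
    "figure shape": "square",
}
# Hypothetical further "yes" card containing one figure.
extra_yes = {
    "border number": "one",
    "border continuity": "solid",
    "figure number": "one",
    "figure color": "purple",
    "figure shape": "square",
}

candidates = commonality([card_1, card_2])
narrowed = commonality([card_1, card_2, extra_yes])
print(candidates)
print(narrowed)
```

The first call leaves "two figures" and "solid borders" as the candidate features; adding a "yes" card with one figure leaves only "solid borders," as the instructions state.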
Instructions for Conservative Strategy: 

You will be able to attain the concept 
successfully by examining cards which do 
and do not belong to the group of cards 
which exemplify the concept. Specifically, 
a “yes” card which differs from a “no” 
card in one dimension tells you one rele- 
vant or important dimension of the con- 
cept. As an example, consider the follow- 
ing three cards: (Figural dimensions are 
verbally presented.) 

Card 1 (no) 
  Two Borders 
  Solid Borders 
  Two Figures 
  Purple Figures 
  Square Figures 
Card 2 (yes) 
  Two Borders 
  Broken Borders 
  Two Figures 
  Purple Figures 
  Square Figures 
Card 3 (?) 
  Two Borders 
  Solid Borders 
  Two Figures 
  Yellow Figures 
  Square Figures 
The center card [Card 2] illustrates the 
concept. The card on the left [Card 1] 
does not illustrate the concept. These two 
cards are different in only one dimension, 
namely, border continuity. Therefore, 
cards which illustrate the concept must 
have broken borders. Would the card on 
the right [Card 3] illustrate the concept? 

You should have answered this ques- 
tion with a “no.” The card on the right 
has solid borders. We know from the other 
two cards that cards which illustrate the 
concept must have broken borders. The 
third card then cannot illustrate the con- 
cept. At this point you would know one of 
the relevant features of the concept. By 
examining other “yes” and “no” card 
combinations, you can determine exactly 
what is the correct concept. 
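
The conservative instructions can likewise be sketched as a comparison of one "yes" card with one "no" card. The Python below is an illustrative reading under stated assumptions, not the procedure as administered: the function name relevant_dimension and the dictionary keys are hypothetical.

```python
def relevant_dimension(yes_card, no_card):
    """If the cards differ in exactly one dimension, return (dimension, "yes" level).

    Returns None when the pair differs in zero or several dimensions, because
    such a pair does not isolate a single relevant feature.
    """
    differing = [dim for dim in yes_card if yes_card[dim] != no_card[dim]]
    if len(differing) != 1:
        return None
    dim = differing[0]
    return dim, yes_card[dim]


# Cards 1-3 from the instructions.
card_1_no = {
    "border number": "two",
    "border continuity": "solid",
    "figure number": "two",
    "figure color": "purple",
    "figure shape": "square",
}
card_2_yes = dict(card_1_no, **{"border continuity": "broken"})
card_3_unknown = dict(card_1_no, **{"figure color": "yellow"})

feature = relevant_dimension(card_2_yes, card_1_no)
dim, level = feature
card_3_can_be_exemplar = card_3_unknown[dim] == level
print(feature, card_3_can_be_exemplar)
```

The pair isolates border continuity with the level "broken," and Card 3, which has solid borders, is therefore ruled out, matching the answer given in the instructions.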

Following the special instruction all Ss 
were shown how to respond with their con- 
cept. This was done by using a slip of paper 
on which the five dimensions and their two 
levels were specified. The Ss' task was to 
check the relevant level of dimensions 
which specified the concept. Upon comple- 
tion of these instructions each S was given 
an opportunity to ask questions about any- 
thing which was not clear. When questions 
did arise, the appropriate section of the 
instructions was repeated and explained. 

In the experiment Ss were instructed to 
solve four problems. Problems were con- 
junctive with two dimensions relevant for 
each. Two exemplars of the concept—one 
the exemplar focus card—and two non- 
exemplars were specified for each problem. 
The exemplar which shared with the focus 
card the two relevant dimensions was indi- 
cated with a “yes” marker. Two nonexem- 
plars, which differed from the focus card in 
only one relevant dimension each, were 
specified with "no" markers. Hence, equal 
amounts (bits) of exemplar and nonex- 
emplar information were presented regard- 
ing each concept. 

Data of two types were recorded by the 
examiner; namely, (a) time taken to attain 
successfully the concept, and (b) which 
cards S used in the attainment of the con- 
cept. This last variable was secured at the 
completion of all four problems. By know- 
ing which cards S used in the attainment 
of the concept, it was possible to determine 
whether he had used the respective strategy 
set forth in the special instructions given 
him. Upon completion of the fourth prob- 
lem, then, Ss were asked which cards they 
had used in the attainment of the concept. 
It was expected that Ss who received the 
commonality instructions would indicate 
use of the exemplars and that Ss who re- 
ceived the conservative instructions would 
use the nonexemplars. Departure from this 
conformity to strategy was appropriately 
noted by the experimenter for later analysis. 


ANALYSIS 


Time-to-criterion scores were ex- 
amined in a 2 X 2 X 2 factorial, re- 
peated measures design. Stratifying 
variables were grade—fifth and sixth 
—and sex. The treatment variable 
consisted of a comparison of the con- 
servative and commonality strategies. 
These three main effects and their in- 
teractions constituted the first part of 
the analysis. The second part was the 
examination of repeated measures, 
that is, the four problems. The third 
and last part of the analysis consisted 
of the interaction of the repeated 
measures with the main effects and 
their interactions. 

The second dependent variable— 
departure from instructional mode 
(strategy) —was analyzed using chi- 
square techniques. In this analysis 
the frequency of departure from the 
instructional method of solution was 
collapsed over stratifying variables 
leaving a 2 X 2 table, namely, com- 
monality versus conservative instruc- 
tions and commonality versus conser- 
vative solutions. Thus the relationship 
of instruction to method of solution 
was examined. 


RESULTS 


Table 1 presents the results of the 
analysis of variance performed on 
time-to-criterion scores.² Tables 2 
through 6 present mean times-to- 
criterion for selected significant effects 
in Table 1 and their respective Dun- 
can's range. 


TABLE 1 
ANALYSIS OF VARIANCE ON TIME- 
TO-CRITERION SCORES FOR 
CONCEPT-ATTAINMENT 
PROBLEMS 

Source                        df      MS        F 
Grade (G)                      1     9.79     4.20* 
Sex (S)                        1    11.66     5.03* 
Instructional Strategy (I)     1    36.66    15.80** 
G X S                          1      .51       - 
G X I                          1     6.26     2.69 
S X I                          1    12.11     5.22* 
G X S X I                      1      .61       - 
S/Treatments                  32     2.32 
Repeated Measures              3    31.66    20.69** 
G X RM                         3     5.03     3.29* 
S X RM                         3      .99       - 
I X RM                         3    14.36     9.39** 
G X S X RM                     3      .27       - 
G X I X RM                     3     8.06     5.27** 
S X I X RM                     3     1.43       - 
G X S X I X RM                 3      .69       - 
S/T X RM                      96     1.53 
x                              1 
Total                        160 

* p < .05. 
** p < .01. 
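
As a reading aid for Table 1, the F-ratios can be recomputed from the mean squares. The sketch below assumes the usual error terms for a design of this kind: between-subject effects are tested against S/Treatments and effects involving repeated measures against S/T X RM. The constant and function names are illustrative, and only effects with a printed F are included.

```python
# Error mean squares read from Table 1.
MS_ERROR_BETWEEN = 2.32  # S/Treatments, 32 df
MS_ERROR_WITHIN = 1.53   # S/T X RM, 96 df

# effect -> (mean square, F as printed in Table 1)
BETWEEN_EFFECTS = {
    "G": (9.79, 4.20),
    "S": (11.66, 5.03),
    "I": (36.66, 15.80),
    "G X I": (6.26, 2.69),
    "S X I": (12.11, 5.22),
}
WITHIN_EFFECTS = {
    "RM": (31.66, 20.69),
    "G X RM": (5.03, 3.29),
    "I X RM": (14.36, 9.39),
    "G X I X RM": (8.06, 5.27),
}


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


def max_discrepancy(effects, ms_error):
    """Largest absolute gap between recomputed and printed F-ratios."""
    return max(abs(f_ratio(ms, ms_error) - printed) for ms, printed in effects.values())


for label, effects, ms_error in (
    ("between", BETWEEN_EFFECTS, MS_ERROR_BETWEEN),
    ("within", WITHIN_EFFECTS, MS_ERROR_WITHIN),
):
    for name, (ms, printed) in effects.items():
        print(f"{label:8s}{name:12s}F = {f_ratio(ms, ms_error):6.2f} (printed {printed:.2f})")
```

Under these assumptions every recomputed ratio agrees with the printed value to within rounding of the two-decimal mean squares.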

Of the three main effects and their 
interactions, four sources of variance 
were significant. The Ss that had re- 
ceived the commonality instructions 
were more efficient than those that 
had received the conservative instruc- 
tions. The mean for the Commonality 
Instructional Group was .90 minutes, 
for the Conservative Instructional 
Group 1.86. The sixth graders demon- 
strated less efficient solution than did 
the fifth graders, the mean for the 
fifth graders being 1.13 minutes, for 
the sixth graders 1.63 minutes. This 
difference was significant at the .05 
level. The females were more efficient 
than the males; the mean time for the 
females was 1.11 minutes, for the 
males 1.65 minutes. This difference 
was significant at the .05 level. Table 
2, the Sex X Instructional Strategy 
interaction, indicates that males re- 
ceiving conservative instructions were 
significantly less efficient than the 
females receiving that set of instruc- 
tions as well as both males and fe- 
males receiving commonality instruc- 
tions. 


TABLE 2 
MEAN TIMES-TO-CRITERION FOR SEX 
X STRATEGY INTERACTION AND 
DUNCAN'S RANGE 

            Conservative    Commonality     Duncan's 
Subjects    instructions    instructions    range 
Males           2.41            .90           .8 
Females         1.32            .91 


2 Levels of significance of F-ratios in- 
volving repeated measures reflect considera- 
tion of Greenhouse and Geisser (1959). 


TABLE 3 
MEAN TIMES-TO-CRITERION FOR 
REPEATED MEASURES AND 
DUNCAN'S RANGE 

        Ordinal position 
 One     Two    Three    Four    Duncan's range 
2.66    1.27     .91     .68          1.9 


Table 3 presents the means and 
Duncan's range for the four ordinal 
positions. The difference between Or- 
dinal Position 1 and Ordinal Position 
4 was significant. The remaining dif- 
ferences were not significant. The fact 
that the first problem was especially 
difficult for Ss attempting to use the 
conservative strategy is shown in 
Table 4. None of the other differences 
were significantly different in this 
Instruction x Repeated Measures in- 
teraction. 

The significant Grade X Repeated 
Measures interaction, the means for 
which are reported in Table 5, resulted 
because the performance of the 
sixth-grade Ss on the first problem 
differed significantly from their 
performance on the third and fourth 
problems and from the performance of 
the fifth-grade Ss on all four problems. 
A similar phenomenon is evident in 
Table 6, where the Grade X Instruction 
X Repeated Measures interaction reveals 
Ordinal Position 1 to be especially 
difficult for sixth-grade Ss who have 
received the conservative instructions. 
Close examination reveals other 
significant ranges for which 
psychological implications are not 
readily apparent. 


TABLE 4 
MEAN TIMES-TO-CRITERION FOR STRATEGY 
X REPEATED MEASURES INTERACTION 
AND DUNCAN'S RANGE 

                     Ordinal position 
Instruction      One     Two    Three    Four    Duncan's range 
Conservative    4.04    1.45    1.14     .82          2.5 
Commonality     1.29    1.10     .67     .55 


TABLE 5 
MEAN TIMES-TO-CRITERION FOR GRADE X 
REPEATED MEASURES INTERACTION AND 
DUNCAN'S RANGE 

              Ordinal position 
Grade     One     Two    Three    Four    Duncan's range 
Fifth    1.89    1.14     .88     .63          2.5 
Sixth    3.44    1.41     .94     .74 

TABLE 6 
MEAN TIMES-TO-CRITERION FOR GRADE X 
STRATEGY X REPEATED MEASURES 
INTERACTION AND DUNCAN'S RANGE 

Instructional                 Ordinal position 
strategy        Grade     One     Two    Three    Four    Duncan's range 
Conservative    Fifth    2.40    1.41    1.06     .79 
                Sixth    5.68    1.48    1.22     .84          1. 
Commonality     Fifth    1.38     .87     .69     .47 
                Sixth    1.20    1.33     .65     .63 


The chi-square analysis of Ss' de- 
partures from instructional strategy 
was statistically significant at the .001 
level. Table 7 presents the respective 
values of this analysis. It is evident 
from this analysis that there is a 
distinct preference for the commonal- 
ity strategy when it is contrasted with 
the conservative strategy and when 
Ss are elementary school children from 
the fifth and sixth grades. Six fifth 
graders and four sixth graders of the 
10 from each grade which received 
the conservative instructions reverted 
to the commonality solution. No Ss 
switched from the commonality in- 
structional strategy to the conserva- 
tive solution. 
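
The chi-square value can be reproduced from the frequencies just described: 20 Ss under each instruction, with 10 of the conservative group reverting to the commonality solution and none switching the other way. The sketch below is a plain Pearson chi-square for the 2 X 2 table without continuity correction, which is an assumption about how the printed value was obtained; the names are illustrative.

```python
# Rows: commonality instruction, conservative instruction.
# Columns: commonality solution, conservative solution.
observed = [
    [20, 0],
    [10, 10],
]

# Tabled chi-square critical value for 1 df at the .001 level.
CRITICAL_001_DF1 = 10.83


def chi_square(table):
    """Pearson chi-square statistic for a two-way frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
    return statistic


statistic = chi_square(observed)
print(round(statistic, 2), statistic > CRITICAL_001_DF1)
```

The expected frequencies are 15 and 5 in each row. The unrounded statistic is slightly above the 13.32 obtained by summing cell contributions truncated to two decimals, and either figure exceeds the .001 critical value for one degree of freedom.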


DISCUSSION 


Two distinct strategies, both of 
which lead to the attainment of 
concepts, have been contrasted. The 
significant difference between them 
indicates that most of the fifth and 
sixth grade Ss used in the study were 
not able to use the conservative 
strategy. Of those that did, all required 
above the median time-to-criterion 
over all problems, revealing the 
conservative strategy to be less efficient 
for young Ss in the intermediate 
grades. The nature of the conservative 
strategy requires greater use of formal 
logic than does the commonality 
plan; hence, this finding adds 
empirical support to Inhelder and 
Piaget (1958). Additionally supportive 
is the fact that the performance 
of sixth-grade students was inferior to 
that of fifth-grade students. It seems 
likely that the sixth graders as a group 
became more aware of the complexity 
of the task and the combinatorial 
aspects of the stimuli than did the fifth 
graders. This is further supported by 
the significant Grade X Instruction X 
Repeated Measures interaction in 
which the sixth graders who had 
received the conservative instructions 
were significantly less efficient 
initially than all others. It appears that 
the fifth graders attained the concept 
by looking at the common elements 
(class inclusion) and just missed or 
ignored anything else (unless their 
attention was drawn specifically to it), 
thus demonstrating the absence of 
concern for finer discrimination of 
elements which is present in more formal 
thought. The significant Grade X 
Repeated Measures interaction also 
supports this conclusion, as does the 
Instruction X Repeated Measures 
interaction. More efficient behavior 
occurred as Ss reverted to Piaget's class 
inclusion operation at the concrete 
level. 


TABLE 7 
CHI-SQUARE OF DEPARTURE FROM INSTRUCTIONAL STRATEGY 

                            Commonality solution      Conservative solution 
Commonality instruction     (20 - 15)²/15 = 1.66      (0 - 5)²/5 = 5.00 
Conservative instruction    (10 - 15)²/15 = 1.66      (10 - 5)²/5 = 5.00 
Total                       30                        10                        40 

χ² = 13.32 

While not a unique finding of this 
study, the significant sex effect does 
reflect organismic differences in con- 
ceptual behavior. It is doubtful that 
this significance resulted because of a 
Sex x Concept interaction as was 
Olson’s conclusion. Rather it is likely 
that the difference resulted because of 
advanced verbal development of fe- 
males. The significant Sex X Instruc- 
tion interaction also supports this 
conclusion. Males would generally be 
expected to perform in a fashion su- 
perior to females on an analytic task. 
As this is not the case, the verbal 
superiority explanation seems most 
tenable. 

Also of major significance is the 
empirical validation that "strategy" 
can be effectively studied through a 
three stage process, namely (a) “pro- 
gramming” Ss with the plan or strat- 
egy, (b) having Ss attain a concept, 
and (c) then making subsequent ex- 
amination to determine whether or 
not the respective strategy has been 
followed and to determine the relative 
efficiency of Ss instructed to use the 
strategies studied. Results of the pres- 
ent study reflect different develop- 
mental levels in children's thinking. 
The reduced efficiency of the sixth- 
grade Ss reflected initial preoccupa- 
tion with a combinatorial system 
(formal thought) not present in the 
fifth-grade Ss. This apparently inher- 
ent and increased mediational activity 
in the older Ss seemed to be reinforced 
by instructions regarding a conserva- 
tive strategy though many Ss so “pro- 
grammed” eventually reverted to a 
class-inclusion operation. Thus it was 
shown that the three-stage process can 
be effectively used to study conceptual 
behavior even to the extent of com- 
paring “programmed strategies” in- 
teracting with other variables. 


IMPLICIT ASSOCIATIVE RESPONSE OCCURRENCE IN 
LEARNING WITH RETARDED SUBJECTS: 
A SUPPLEMENTARY REPORT 


WILLIAM P. WALLACE¹ 
Northwestern University 


100 words were read to groups of normal and retarded Ss. As each word 
was read Ss were required to indicate whether it had been read earlier in 
the list. Words appeared in the list which were presumed to elicit spe- 
cific implicit associative responses (IARs), and later in the list the pre- 
sumed IARs were presented. A significant interaction indicated that 
this manipulation resulted in a greater increase in error rates for normal 
Ss than for retarded Ss. The results were interpreted as supporting the 
notion that retarded Ss make fewer or weaker IARs when presented 
with verbal units. 


The classification of mental retarda- 
tion implies a learning deficiency. An 
understanding of the nature of this 
learning deficiency should be of prac- 
tical importance. When specific de- 
ficiencies are understood, classroom 
training can proceed most efficiently, 
compensating for weaknesses of the 
students and capitalizing on their 
strengths. The present study was de- 
signed to investigate one potential 
source of deficiency in learning of re- 
tarded subjects (Ss), namely the oc- 
currence of implicit associative re- 
sponses. 

The purpose of the present study 
was to investigate the influence of 
implicit verbal responses in producing 
recognition errors among normal and 
retarded Ss. A familiar word pre- 
sented as a stimulus in a verbal-learn- 
ing task may elicit two types of im- 
plicit responses. Bousfield, Whitmarsh, 
and Danick (1958) have labeled as 
the “representational response” (RR) 
the response to the word itself as the 
act of perceiving it. The second kind 
of implicit response, the implicit asso- 
ciative response (IAR), is produced 
by the stimulus properties of the RR 
and is conceived of as being related to 
the RR. For example, as the act of 
perceiving the letters D-O-G an RR of 
“dog” is presumed to occur. Implicit 
associative responses are presumed to 
occur to the RR, hence “cat” is a 
presumed IAR to the RR “dog”. It 
is assumed that specific IARs may 
be predicted from a knowledge of pre- 
experimental associative habits. The 
influence of IARs in producing recog- 
nition errors was of interest in the 
present study. 


1 Now at the University of Nevada. 

The task was similar to one used 
by Underwood (1965). The Ss were 
presented with a series of words and 
were asked to indicate whether each 
successive term was “old” or “new,” 
that is, whether it had occurred before. 
In the list were critical words assumed 
to elicit particular IARs. At a later 
point in the list the words assumed to 
be IARs (experimental words) to the 
earlier RRs were presented. Under- 
wood found there was a significantly 
greater tendency for Ss to respond in- 
correctly to experimental words than 
to control words (words presumed not 
to have occurred earlier as IARs). 
It was as if the experimental words 
had occurred earlier in the list. That 
is, Ss were more likely to recognize 
falsely cat when dog appeared earlier 
in the list as compared to table when 
chair had not appeared earlier in the 
list. 

Wallace and Underwood (1964) pre- 
sented evidence consistent with the 
position that retarded Ss respond to 
verbal units with fewer or weaker 
IARs. Their results indicated that with 
college Ss IARs to conceptually re- 
lated words facilitated free learning 
and interfered with paired-associate 
learning. With retarded Ss, manipula- 
tions of conceptual similarity had 
little effect upon free learning and 
paired-associate learning, resulting in 
a triple interaction between Class of 
S x Type of Task X Similarity. It 
was suggested that the triple inter- 
action was due to a weaker or less 
frequent IAR occurrence among re- 
tarded Ss as compared to normal Ss. 

The present study was designed to 
investigate the influence of implicit 
verbal responses in producing recog- 
nition errors among normal and re- 
tarded Ss. Underwood (1965) demon- 
strated that error rates were higher 
for specific words that were presumed 
to have occurred earlier in the list, as 
IARs, when compared to specific 
words presumed not to have occurred 
earlier as IARs. Wallace and Under- 
wood presented evidence suggesting 
that there is an IAR deficiency with 
retardates. It follows that retardates 
should exhibit less difference than 
normals in error rates between words 
presumed to have occurred as IARs 
earlier in a list and words presumed 
not to have occurred as IARs earlier. 
Thus, an interaction between Class of 
S x Type of Word is predicted. 


METHOD 


Materials 


The materials were taken from the Un- 
derwood (1965) recognition list of 200 words. 
From this list 100 words were selected to be 
used in the present study. The words used 
may be separated into four categories ac- 
cording to their assumed relation to other 
list words. Comprising one category were 
critical stimulus (CS) words. These words 
were assumed to elicit specific IARs. Ex- 
perimental (E) words constituted a second 
category. These words represented the as- 
sumed IARs to the CS words. The third 
class consisted of control (C) words. These 
words served as controls for the E words 
and were presumed not to have been pre- 
ceded in the list by words for which they 
were IARs. The final category contained 
filler (F) words. These words were presumed 
to be neutral with regard to E words and 
were used to establish specific repetition 
frequencies. 

The list consisted of 15 E words, 15 C 
words, 44 CS words, and 26 F words. There 
were 14 repeated words, that is, words 
occurring for the second or third time. The 
E and C words were restricted to the second 
half of the list. Four different relations were 
represented by the E words and their 
corresponding CS words. The E word was either 
an antonym (A1 if the CS word appeared 
once prior to its E word and A3 if the CS 
word appeared three times prior to its E 
word), a converging word associate (CV), a 
superordinate (SO), or a sense impression 
(SI) of its corresponding CS word(s). Some 
E words had more than one CS word, and 
some CS words were repeated more than 
once prior to the appearance of their 
respective E word. The specific CS words, E 
words, and C words and the list positions of 
the E words are presented in Table 1. A 
given C word was always two positions 
removed from its corresponding E word: two 
positions preceding the E word on eight 
occasions and two positions following the E 
word on seven occasions. 


TABLE 1 
CRITICAL STIMULUS WORDS, EXPERIMENTAL WORDS, AND CONTROL WORDS 

Class   CS words                                  E words   Position   C words 
A1      BOTTOM                                    Top          57      Down 
        GIVE                                      Take         67      Good 
        DAY                                       Night        85      Low 
        MAN                                       Woman        96      Rich 
A3      ROUGH                                     Smooth       77      Weak 
        FALSE                                     True         80      Dirty 
        HARD                                      Soft         89      Short 
        SLOW                                      Fast         97      Girl 
CV      BUTTER, CRUMB                             Bread        64      Bridge 
        SUGAR, BITTER, CANDY                      Sweet        74      Salt 
        WARM, CHILL, FREEZE, FRIGID, HOT, ICE     Cold         76      Cloud 
        DARK, HEAVY, LAMP, MATCH                  Light        88      Leg 
SO      PEACH, GRAPE, APPLE, PEAR                 Fruit        79      Cloth 
        ROBIN, SPARROW, BLUEJAY, CANARY           Bird         95      Flower 
SI      BANDAGE, CHALK, MILK, RICE, SNOW          White        92      Red 


Procedure 


Each S was given three sheets of paper 
with numbers, each number followed by the 
words yes and no. The first page was num- 
bered from 1 to 10, the second page from 
1 to 50, and the third page from 51 to 100. 
The Ss were instructed that they would 
hear a list of words, and as they heard each 
word, they were to ask themselves if the 
word had appeared in the list before. If the 
word occurred for the first time they were 
to circle no, but if it had appeared in the 
list earlier, they were to circle yes. A 10-word 
practice list was read by E to aid Ss in un- 
derstanding the procedure. After the Ss 
corrected their practice list, the 100-word 
test list was presented by a magnetic tape 
recorder at a 10-second rate (10 seconds 
elapsed between the reading of each word). 
Each word was spoken twice in immediate 
succession, and the word number was in- 
dicated before every word. After the fiftieth 
word there was an additional 10-second 
pause, during which Ss were instructed to 
turn to the third page. 


Subjects 


The normal Ss were 31 introductory 
psychology students (13 males and 18 fe- 
males) attending the summer session at 
Northwestern University. This phase of the 
experiment was conducted in one group 
session. 

The retarded Ss were 76 residents of 
Dixon State School.¹ One problem present 
in this study was a differential mortality 
rate between normals and retardates. The 
task proved to be difficult for the retardates, 
as 36 of the original 76 Ss failed to comprehend 
the instructions; thus, the analyses 
were based upon the responses of 40 retarded 
Ss (25 males and 15 females). The 
40 retarded Ss ranged in IQ from 39 to 80 
with a mean of 60.5, and they ranged in age 
from 12.0 to 19.6 years with a mean of 16.1. 
The experiment was conducted in eight 
group sessions with group size varying from 
6 to 15 Ss. 

The Ss were discarded when there were 
indications that they did not understand 
the nature of the task. In most cases the 
decisions were unequivocal. For example, 
Ss were discarded if they failed to respond, 
if they circled both alternatives, or if they 
gave stereotyped responses (e.g., all yes 
responses). However, there was one criterion 
for discarding Ss which deserves further 
comment. By looking at an S's answer sheet 
it was possible to determine how many times 
he responded correctly, that is, circled yes 
when a word was repeated and circled no 
when a word appeared for the first time. 
The Ss responding at or below the chance 
level, that is, obtaining 50 or fewer correct 
responses, were discarded. There were 10 
such Ss, all from the retarded sample. The 
problem arises because it cannot be stated 
with certainty whether these Ss did not 
understand the task or were just poor performers. 
If they were included in the analysis 
and some of them did not understand the 
task, then there would be an artificial reduction 
in the difference between E word 
and C word error rates among the retarded Ss. 
Risking this type of error is undesirable because 
it works in the direction of the hypothesis. 
However, by discarding these Ss 
the error risk is that of discarding some Ss 
who actually understood the instructions 
but were just poor performers. Since there 
was no reason to suspect that poor performing 
retardates would behave more like normals 
on this task than the better performing 
retardates, it was felt that this selection on 
a performance basis would not work in favor 
of the hypothesized interaction. 


¹ The author is grateful to R. Metzger 
and the teachers at Dixon State School for 
their help and cooperation during this phase 
of the experiment. 


RESULTS AND DISCUSSION 


For each S the total number of yes 
responses (false positives) was determined 
for E words and C words. 
The means for retarded and normal 
Ss are presented in Figure 1. We may 
ask three questions of the data. First, 
considering only the college Ss, we 
may ask whether there were more false 
positives to E words than to C words. 
A paired-sample t test yields a value 
of 4.31 (df = 29, p < .01). Second, 
we may ask the same question of retardates, 
that is, did they also falsely 
recognize E words more often than C 
words? A t of 2.51 (df = 38, p < .05) 
indicates that they did. Thus, the re- 
sults are consistent with the notion 
that an RR may be confused with an 
IAR. The third and most interesting 
question concerns the interaction. An 
analysis of variance indicated a significant 
interaction between Class of 
S × Type of Word (F = 5.17, df = 
1/69, p < .05). Since there were several 
zero scores, the analysis of variance 
was repeated with each score 
transformed by adding .5 to it and 
then taking its square root. The transformation 
left the basic findings unaltered. 
Finally, the analysis was repeated 
one additional time including 
the 10 retarded Ss who were discarded 
for failing to reach the 50% 
criterion. The inclusion of these Ss 
increased the magnitude of the inter- 
action (F = 9.62, df = 1/79, p <.01). 
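
The transformation described above (adding .5 to each score and taking the square root) can be sketched as follows. This is only an illustrative sketch: the function name `sqrt_transform` and the score list are hypothetical and are not data from the study.

```python
import math

def sqrt_transform(score):
    """Return sqrt(score + 0.5), the transformation described in the text."""
    return math.sqrt(score + 0.5)

# Hypothetical false-positive counts for five Ss (not data from the study).
raw_scores = [0, 0, 1, 3, 6]
transformed = [round(sqrt_transform(x), 3) for x in raw_scores]
print(transformed)
```

Adding .5 before taking the root keeps zero scores from all collapsing onto zero, which is presumably why the transformation was applied when several zero scores were present.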

The nature of the interaction may 
be seen in Figure 1. The absolute dif- 
ference in mean number of false posi- 
tives to E and C words for retardates 
appears considerably smaller than 
does the same comparison for college 
Ss. The interaction suggests that re- 
tardates may be responding with 
weaker or fewer IARs than normals, 
or that they show less RR-IAR con- 
fusion. The results are consistent with 
Wallace and Underwood (1964) and 
support the position that a characteristic 
of retarded Ss may be a deficiency 
in making IARs. It must be 
recognized that other differences between 
normal and retarded Ss may be 
advanced to account for the results, 
but until a more compelling hypothesis 
is offered, it is suggested that differences 
in IAR processes were critical to 
the resultant interaction. It should be 
pointed out that the interpretation 
rests upon the assumption that there is 
a reasonable consistency between normals 
and retardates in their responses 
in word-association norms. A recent 
report by Gerjuoy and Gerjuoy (1965) 
renders this a feasible assumption as 
there at least appears to be a reasonable 
consistency in the ordering of 
responses among normals and retardates. 


FIG. 1. Mean number of false positives to 
E words and C words for normal and retarded 
Ss. (Ordinate: mean number of false recognitions; 
abscissa: conditions, E words and C words, 
for retarded and normal subjects.) 


TABLE 2 

E WORD-C WORD DIFFERENCE IN ERROR 
RATES AS A FUNCTION OF WORD CLASS 

                      Word class 
Subjects       A₁     A₃     CV     SO     SI 
Normals      +.01   +.24   +.19   -.01   +.05 
Retardates   -.00   +.04   +.07   +.12   -.02 

Table 2 presents the E word-C word 
difference in error rates for each class 
of words. The values in Table 2 were 
obtained for each group of Ss by subtracting 
the composite C-word error 
rates (.11 for normal Ss and .16 for 
retarded Ss) from the error rates for 
each class of E words. For the normal 
Ss the false positive rates as a func- 
tion of class of CS word-E word rela- 
tionship offered a partial replication 
of the Underwood (1965) study. As 
can be seen from Table 2 there were 
only small E word-C word differences 
in error rates for Classes A₁ and SI, 
and there were substantial differences 
for Classes A₃ and CV. The one discrepancy 
from Underwood's data occurred 
in the failure of the present 
study to demonstrate increased false 
positives to E words of the SO class. 
The E word-C word differences in er- 
ror rates for retardates were smaller 
than those for normals in all cases ex- 
cept Class SO. It is not clear why this 
one reversal occurred. 

Each S made 100 responses and ob- 
tained a score of total correct recogni- 
tions (circling yes to a repeated item 
and circling no to a first occurrence). 
College Ss had a mean of 91.2 correct 
recognitions, and retarded Ss had a 
mean of 84.6 correct recognitions. The 
difference between means was significant 
(t = 2.84, df = 69, p < .01). 
Even though S-selection biases served 
to remove the poorest performing re- 
tardates, the normal Ss still performed 
significantly better than the retardates 
on this task. 
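
The scoring rule used throughout (yes is correct for a repeated word, no is correct for a first occurrence, and a yes to a first occurrence is a false positive) can be sketched as below. This is a minimal illustration under stated assumptions: the function names, the five-word toy list, and the E/C word assignments are hypothetical and are not the 100-word list of Table 1.

```python
def score_sheet(words, responses, e_words, c_words):
    """Score one S's yes/no answers for a continuous recognition list.

    words     -- words in presentation order
    responses -- "yes"/"no" answers, one per word
    e_words   -- set of experimental (E) words
    c_words   -- set of control (C) words
    Returns (number correct, false positives to E and C words).
    """
    seen = set()
    correct = 0
    false_pos = {"E": 0, "C": 0}
    for word, answer in zip(words, responses):
        repeated = word in seen
        if (answer == "yes") == repeated:
            correct += 1
        elif answer == "yes":
            # "yes" to a first occurrence is a false positive.
            if word in e_words:
                false_pos["E"] += 1
            elif word in c_words:
                false_pos["C"] += 1
        seen.add(word)
    return correct, false_pos

def passes_chance_criterion(correct, n_items=100):
    """Ss with 50 or fewer correct out of 100 were discarded."""
    return correct > n_items / 2

# Hypothetical toy list: "cold" plays the E word, "chair" the C word.
words = ["hot", "dog", "hot", "chair", "cold"]
responses = ["no", "no", "yes", "no", "yes"]
correct, false_pos = score_sheet(words, responses, {"cold"}, {"chair"})
print(correct, false_pos)
```

In the toy sheet the only error is a yes to the first occurrence of the E word, so it counts once as a false positive to an E word and lowers the total correct by one.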
SECOND-ORDER ABILITY STRUCTURE REVEALED 
IN RIGHTS AND WRONGS SCORES 


JOHN L. HORN AND WILLIAM J. BRAMBLE 
University of Denver 


Using 14 ability tests to measure 12 primary factors in a sample of 
106 adult males, 2nd-order factorial structures were determined sepa- 
rately for both rights scores and wrongs scores. In both analyses the 
results agreed well with previous analyses in showing intellective 
functions interpreted as fluid intelligence, crystallized intelligence, 
general visualization, and general fluency. Other factors were in- 
terpreted tentatively. The correlations between rights scores and 
wrongs scores for the same test were found to range from +.51 to —.98, 
with most being in a range indicating about 25% of variance in com- 
mon for the 2 kinds of scores. The results were interpreted as indicating 
that whereas the same basic functions were evidenced by the interrela- 
tionships among the 2 kinds of scores, decisions about the level of 
function of a particular individual could be quite disparate for rights 
as compared with wrongs scores. 


In recent studies by Cattell (1963), 
Horn (1965) and Horn and Cattell 
(1966a; 1966b; 1967) evidence was 
found to support a theory of fluid and 
crystallized intelligence. According to 
this theory the influences operating in 
development to produce what is gen- 
erally called “intelligence” produce 
not one general intellectual dimension, 
but two. Both of these involve basic 
processes of intelligence (perception 
of relations, eduction of correlates, 
span of apprehension, concept forma- 
tion, concept attainment, the use of 
generalized solution instruments, etc.) 
but in one, termed fluid intelligence 
(abbreviated Gf), the emphasis is on 
problem solving in the immediate situ- 
ation with test materials that are 
largely culture fair whereas the other, 
labeled crystallized intelligence (abbreviated 
Gc), more nearly indicates 
the limits of acculturation. 

The tasks which define the Gf factor 
definitely require “intellectual work,” 
are not simply scanning and move- 
ment tasks, but the test materials are 


*This research was financed by Grant 
Number Ns G-518-618 from the National 
Aeronautics and Space Administration. 


such that for most of the people tested, 
the fundaments (i.e., the contents be- 
tween which relations can be said to 
exist) are about equally common or 
equally novel. In terms of the primary 
abilities analyzed in the Horn and 
Cattell studies with adults, the processes 
and materials are best represented 
by Induction (I-Letter Series²), 
Figural Relations (CFR-Matrices), 
Associative Memory (Ma) and Fig- 
ural Classifications (CFR). Similarly, 
“intellectual work” is indicated by the 
tasks which define Gc, but in this case 
the work involved is not so much that 
of the immediate problem-solving task 
as it is that which would have occurred 
previously through intensive accul- 
turation to which only some, not all, of 
the people tested would have been ex- 
posed. The primaries which particu- 
larly characterized this function in the 
Horn and Cattell studies were Ver- 
bal Comprehension (V), Mechanical 
Knowledge (Mk), Numerical Facility 
(N) and Experiential Evaluation 


² The primary factor abbreviations employed 
throughout this article were taken 
from French, Ekstrom, and Price (1963), if 
they list the factor, or otherwise from French 
(1951) or Guilford and Merrifield (1960). 
(EMS). The fluid and crystallized 
dimensions were found to be highly co- 
operative in General Reasoning (R), 
Semantic Relations (CMR-Word 
Analogies) and Formal Reasoning 
(Rs-Syllogisms). 

Both of the above factors were 
clearly distinguished from a general 
visualization function (abbreviated 
Gv) and a general fluency dimension 
(abbreviated F). Gv involved all tasks 
which required work with spatial ma- 
terials. It included the I and CFR pri- 
maries of Gf, but was defined prin- 
cipally by Spatial Orientation (S), 
Visualization (Vz), Speed of Closure 
(Cs), Flexibility of Closure (Cf), and 
Perceptual Speed (P). The general 
fluency function was manifested in all 
tasks requiring scanning or produc- 
tion of conventional concept labels 
(words), as best represented by As- 
sociational Fluency (Fa), Ideational 
Fluency (Fi), and Word Fluency 
(Fw). 

Horn (1965) and Horn and Cattell 
(1966b, 1967) reviewed cross-sectional 
studies and produced new data 
to show that when putative measures 
of intelligence are classified as “pri- 
marily fluid,” “primarily crystallized,” 
and “about evenly mixed with these 
two,” much of the seemingly contra- 
dictory evidence on age differences in 
intelligence can be seen to be consist- 
ent. Younger adults were found to be 
superior in fluid functions; older adults 
were found to be superior in crystal- 
lized functions and no systematic age 
differences were found for primaries 
and omnibus tests which involved 
these two factors in about the same 
degree. 

This line of research thus shows 
promise of integrating previously di- 
verse bits of evidence in the field of 
human abilities. Perhaps more inter- 
esting, it shows some promise of fur- 
thering rapprochement among pre- 
viously separate, if not antagonistic, 
subfields in which the principal concern 
is with questions about the processes 
of problem solving, perception, 
etc. But this promise can be realized 
only if the results obtained heretofore 
are not specific to either a particular 
sample of primary factors or a particular 
sample of subjects. It is of considerable 
interest, therefore, to test the 
generality of the above results. The 
main purpose of the present study is 
to do this. 

Several years ago Fruchter (1950, 
1953) effectively demonstrated that 
wrongs scores obtained from time-limit 
tests can have reliabilities and validi- 
ties comparable to those for rights 
scores obtained from the same tests. 
More important, he showed that these 
results exist even when the correla- 
tions between rights and wrongs scores 
are quite low—some hovering near 
zero! In a factor analysis which in- 
cluded some rights-score and some 
wrongs-score variables he found a fac- 
tor defined exclusively by  wrongs 
scores. Horn (1965) replicated this 
finding and in his further analyses 
found that the carefulness factor (as 
Fruchter had labeled it) was largely 
independent of other factors even at 
the second and higher orders. 

These findings thus suggest that 
quite different functions are measured 
with wrongs scores as compared with 
rights scores from timed ability tests. 

But there are at least two ways in 
which to interpret this last statement. 
According to one interpretation it is 
evident from the fact of low correla- 
tion between comparable rights and 
wrongs scores that different functions 
are represented in these variables. For 
a low correlation means that a person 
who would be judged “good” in his 
performance on a given test when the 
rights score is used might be (that is, 
is nearly as likely to be) judged 
“poor” in his performance on this same 
test when the wrongs score is used. 
Thus if “function” is thought of in 
terms of level of performance, it is indeed 
evident that different functions 
are tapped by rights and wrongs 
scores. 

However, one might think of “func- 
tion" not simply in terms of level of 
performance, but in terms of patterns 
of correlations of variables with all 
other variables with which they might 
be compared. It is possible for such 
patterns to be very similar for rights 
and wrongs scores even when the two 
kinds of scores correlate rather lowly. 
Horn and Cattell (1965) recently dis- 
cussed this possibility under the head- 
ing of vehicles in measurement. A 
vehicle is an individual differences 
characteristic through which other 
characteristics are expressed. Thus 
carefulness may be systematically ex- 
pressed in all of several kinds of abil- 
ities, but it need not be an influence 
which prevents the appearance of the 
factors representing these abilities 
when a properly rotated factor-ana- 
lytic solution is obtained. In other 
words, if carefulness is a unitary func- 
tion which affects all abilities in ap- 
proximately the same way relative to 
a rights-score function, then it is pos- 
sible for the same factors to appear in 
analyses of wrongs scores alone and 
in analyses of rights scores alone. 

With respect to these considerations, 
then, the present investigation was de- 
signed to: (a) replicate previous find- 
ings with a new sample of subjects and 
a somewhat different sample of vari- 
ables, and (b) see if the basic struc- 
ture is found both among rights-score 
and among wrongs-score variables. 


PROCEDURE 


Selection of Variables 


As in the previous studies of the fluid- 
crystallized concept, the French, Ekstrom, 
and Price (1963), French (1951), and Guil- 
ford and Merrifield (1960) collations of 
rotated factor results provided the basis for 
selection of primary factor variables. To 
ensure some continuity with previous stud- 
ies, nine primary factors used in the Horn 
and Cattell (1966a) analyses were selected 
for the present investigation. These are the 
variables identified with asterisks in Table 
1, which provides basic descriptive data on 
all variables. One test was used to measure 
each primary factor. For P, Vz, Cf, and I 
the tests were entirely different from those 
used in the Horn-Cattell studies, although 
they represented the same primaries. The 
other tests were different only in the sense 
that they involved somewhat different items. 

Three new primary factors and two new 
tests, for which the factor structure was not 
known, were added to the core variables 
described above. The Aiming primary, (A), 
was included on the hypothesis that it would 
help define general visualization, while two 
memory primaries, Memory Span (Ms) 
and Memory for Designs (Md), were included 
on hypotheses that a major portion 
of their variances would go to fluid intelli- 
gence. The Dominoes Test, originated by 
Anstey and developed by Vautrain (1954), 
was included on the assumption that it 
would have both general visualization and 
fluid intelligence variance, but mainly the 
latter. S. Mednick's (1962)³ Remote Associations 
Test (RAT) was included as a measure 
of creativity. Burt (1962) has pointed out 
that useful creativity (in contrast to the 
originality of the insane, the delirious, the 
day-dreamer, etc.) involves the noegenetic 
processes of intelligence, as most fully de- 
scribed in Spearman’s work. On this basis it 
was predicted that to the extent that the 
RAT indeed measures creativity, it should 
have some variance on the fluid-intelligence 
factor. Intuitively it would seem to involve 
some variance on the general-fluency fac- 
tor as well. 


Subjects and Administration 


The subjects were 106 male inmates at 
the Colorado State Penitentiary.⁴ The tests 
were administered in 10 parts on 10 occasions 
spaced over 5 days. There was one 
testing session in the morning, one in the 
afternoon on each day. A testing session 
lasted about 1½ hours.⁵ The men were 
volunteers. They were paid $2.00 each for 
their services.⁶ The morale and motivation 
were judged to be very high throughout, although 
somewhat lower in the later sessions 
than in the earlier ones. 


³ The authors are very grateful to S. A. 
Mednick for making the materials for this 
test available to us prior to the publication, 
by Science Research Associates, of a commercial 
version of the RAT. 

⁴ The authors thank George Levy, Senior 
Psychologist at the Colorado State Penitentiary, 
for his help in securing this sample. In 
this respect they are also grateful to Wayne 
Patterson, Warden, and Harry Tinsley, 
Director of Institutions. 


TABLE 1 

PRIMARY FACTOR VARIABLES, TESTS USED TO ESTIMATE THEM, AND MEANS 
AND STANDARD DEVIATIONS 

    Primary factor           Symbol  Test used                   M       Sigma 
Rights Variables 
 1. Perceptual Speed         *P      Cancel Numbers              515.9   73.00 
 2. Aiming                   A       Dot in Circle               1145    386.0 
 3. Visualization            *Vz     Follow the Line             425.6   102.8 
 4. Flexibility of Closure   *Cf     Copy the Pattern            110.5   50.41 
 5. Figural Relations        *CFR    Figure Series               37.49   7.429 
 6. Deduction                D       Dominoes                    28.76   8.086 
 7. Memory Span              Ms      Forward-Backward Digits     434.6   188.6 
 8. Memory for Designs       Md      Redraw the Figure           4.868   5.206 
 9. Induction                *I      Letter Series               58.06   15.82 
10. Semantic Relations       *CMR    Common Word Analogies       22.06   5.449 
11. Creativity               RAT     Remote Associations         86.79   32.75 
12. Associational Fluency    *Fa     Similar Words               66.74   29.63 
13. Ideational Fluency       *Fi     Things                      103.9   40.78 
14. Verbal Comprehension     *V      General Information         36.12   7.683 
Wrongs Variables 
15. Perceptual Speed         P       Cancel Numbers              21.67   16.48 
16. Aiming                   A       Dot in Circle               413.5   334.4 
17. Visualization            Vz      Follow the Line             9.934   33.54 
18. Flexibility of Closure   Cf      Copy the Pattern            56.32   43.84 
19. Figural Relations        CFR     Figure Series               22.54   7.417 
20. Deduction                D       Dominoes                    24.09   8.238 
21. Memory Span              Ms      Forward-Backward Digits     45.36   43.89 
22. Induction                I       Letter Series               15.11   10.50 
23. Semantic Relations       CMR     Common Word Analogies       18.38   5.141 
24. Creativity               RAT     Remote Associations         19.65   14.57 
25. Associational Fluency    Fa      Controlled Associations     41.65   22.81 
26. Ideational Fluency       Fi      Things                      5.207   6.334 
27. Verbal Comprehension     V       General Information         22.31   6.709 


ANALYSES AND RESULTS 


The number of correct answers and 
the number of incorrect answers were 
determined for each part-test. A variable 
was then obtained as a simple 
(unweighted) sum of the raw scores on 
the 10 part-tests. All variables, both 
rights and wrongs scores, were intercorrelated 
by the product-moment 
formula. The resulting table of intercorrelations 
has been deposited with 
ADI.⁷ In Table 2 are shown the correlations 
between rights and wrongs 
scores for the same test. 


⁵ Marlan Wilson, Jr. directed the administration 
and scoring of the tests. He was 
assisted by Messrs. Mathis and Conrad. The 
help of these men is very gratefully acknowledged. 

⁶ We were told that on the prison-to-street 
exchange, this is equivalent to about 
$10.00 to $15.00 outside the prison. 

⁷ Material supplementary to this article 
has been deposited with the American Documentation 
Institute. Order Document No. 
9274 from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of 
Congress, Washington, D. C. 20540. Remit 
in advance $1.25 for microfilm or $1.25 for 
photocopies and make checks payable to: 
Chief, Photoduplication Service, Library of 
Congress. 


TABLE 2 

CORRELATIONS BETWEEN RIGHTS AND WRONGS SCORES FOR THE SAME TEST 

Variable symbol 
      P     A    Vz    Cf   CFR     D    Ms     I   CMR   RAT    Fa    Fi     V 
r   .12  -.84  -.82  -.55  -.98  -.63   .30  -.64  -.98  -.40   .51   .40  -.92 


The submatrices involving rights 
scores alone and wrongs scores alone 
were abstracted and factored sepa- 
rately. Algebraic independence was 
maintained among the variables of 
these two sets. In each set the squared 
multiple correlation of a variable with 
the other variables of the set was de- 
termined and inserted in the principal 
diagonal as a communality estimate. 
Principal axes factors were then ex- 
tracted until a latent root became zero. 
The number of factors to be rotated 
was further limited by selecting only 
those factors with roots larger than 
one which had at least one loading 
greater than .20 (absolute value)— 
that is, taking factors in order of size 
of latent roots, when a principal axis 
factor was encountered which did not 


have at least one loading greater than 
.20 (absolute value), that factor and 
all factors with smaller latent roots 
were eliminated in the rotations. The 
factors were rotated in accordance 
with the criteria of Kaiser’s (1958) 
varimax procedure. 
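
The rule for limiting the number of factors to be rotated can be sketched as follows. This is only a sketch of one reading of the rule stated above: factors are taken in order of latent root, and the first factor that fails either condition (root larger than one, at least one absolute loading greater than .20) ends the selection, along with all smaller factors. The function name `select_factors`, its parameters, and the roots and loadings are hypothetical; the numbers are made up for illustration and are not internally consistent with a real factor solution.

```python
def select_factors(roots, loadings, min_root=1.0, min_loading=0.20):
    """Return indices of the factors retained for rotation.

    roots    -- latent root of each principal axis factor
    loadings -- one list of variable loadings per factor
    """
    # Take factors in order of size of latent root, largest first.
    order = sorted(range(len(roots)), key=lambda j: roots[j], reverse=True)
    kept = []
    for j in order:
        if roots[j] <= min_root:
            break
        if max(abs(x) for x in loadings[j]) <= min_loading:
            # This factor and all factors with smaller roots are dropped.
            break
        kept.append(j)
    return kept

# Hypothetical roots and loadings (three variables, five factors).
roots = [4.1, 2.3, 1.4, 1.2, 0.6]
loadings = [
    [0.70, 0.60, 0.10],
    [0.10, 0.50, 0.30],
    [0.10, 0.15, -0.18],  # no loading exceeds .20 in absolute value
    [0.40, 0.10, 0.10],   # would pass, but follows a failing factor
    [0.05, 0.02, 0.01],
]
kept = select_factors(roots, loadings)
print(kept)
```

Note that the fourth factor is dropped even though it has a loading above .20, because the rule eliminates every factor that follows the first failing one.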

The rotated results from the factor 
analyses are shown in Tables 3 and 4. 
The squared multiple correlation com- 
munalities are shown in the next-to- 
last column in each of these tables; 
the communalities based on factor 
loadings are shown at the far right. 


DISCUSSION 


The correlations of Table 2 suggest 
that when a task is highly speeded 
but conceptually simple (Canceling, 
for example), the number of errors 
correlates positively with the number 
of correct responses. 


TABLE 3 

ROTATED FACTORS FROM ANALYSIS OF RIGHTS SCORES 

                                   Second order factors 
Symbol | Primary factor          |   I |  II | III |  IV |   V |  VI | SMR | h² 
P      | Circle Numbers          |  11 |  01 |  50 |  12 |  09 | -06 |  31 |  39 
A      | Aiming                  |  18 |  11 |  50 |  18 |  16 |  41 |  45 |  52 
Vz     | Visualization           |  25 |  07 |  67 |  08 |  18 |  23 |     |    
Cf     | Flexibility of Closure  |  45 |  28 |  37 |  24 |  32 |     |     |    
CFR    | Figural Series          |  61 |  03 |  31 |  10 |  35 |  18 |     |    
D      | Deduction               |  69 |  38 |  25 |  10 |  13 |  04 |  65 |    
Ms     | Memory Span             |  12 |  04 |  12 |  11 |  49 |  01 |  23 |    
Md     | Memory Designs          |  22 |  09 |  18 |  11 |  53 |  39 |     |    
I      | Letter Series           |  57 |  42 |  22 |  27 |  40 |     |     |    
CMR    | Common Word Analogies   |  39 |  51 |  16 |  16 |  48 |     |     |    
RAT    | Creativity              |  25 |  61 | -01 |  15 | -06 |  10 |  42 |    
Fa     | Controlled Associations |  16 |  41 |  15 |  70 |  35 | -04 |  76 |    
Fi     | Things                  |  13 |  28 |  24 |  74 |  09 |  18 |     |    
V      | General Information     |  05 |  75 |  07 |  28 |  18 |     |     |    

Note. Decimal points have been omitted in the factor coefficients in order to simplify 
the presentation. 


TABLE 4 

ROTATED FACTORS FROM ANALYSIS OF WRONGS SCORES 

                                   Second order factors 
Symbol | Primary factor          |   I |  II | III |  IV |   V | SMR | h² 
P      | Circle Numbers          |  10 |  11 |  58 | -06 |  39 |  45 |  52 
A      | Aiming                  |  16 |  03 |  56 | -02 | -04 |  32 |  35 
Vz     | Visualization           |  14 |  04 |  65 | -03 |  07 |  37 |  45 
Cf     | Flexibility of Closure  |  46 |  33 |  28 | -04 |  43 |  55 |  59 
CFR    | Figural Series          |  55 | -11 |  31 | -18 |  25 |  45 |  51 
D      | Deduction               |  69 |  31 |  15 |  13 |  03 |  60 |  62 
Ms     | Memory Span             |  25 |  17 | -11 |  22 |  03 |  15 |  15 
I      | Letter Series           |  73 |  22 |  32 |  05 | -11 |  63 |  69 
CMR    | Common Word Analogies   |  61 |  49 |  19 | -15 |  24 |  66 |  73 
RAT    | Creativity              |  14 |  32 | -15 |  61 | -26 |  47 |  58 
Fa     | Controlled Associations |  07 | -09 | -05 |  72 |  07 |  45 |  54 
Fi     | Things                  | -13 | -11 |  06 |  61 | -04 |  33 |  40 
V      | General Information     |  26 |  70 |  10 | -06 |  04 |  50 |  57 

Note. Decimal points have been omitted in the factor coefficients in order to simplify 
the presentation. 

In the Canceling test the wrongs 
score is literally an error score, but 
this is not quite true for the other three 
tests for which there is positive corre- 
lation between rights and wrongs 
scores. In Memory for Digits the sub- 
ject was required to recall the digits in 
the order in which they were given (in 
half the items) or in reverse order. If 
any one of the digits given in an 
answer was incorrect or out of order, 
the response on that item was scored 
wrong. In fluency tests (Similar Words 
and Things) a response was judged 
“wrong” if it seemed to be a quite 
bizarre, irrelevant association. For ex- 
ample, the response “clip” to the key 
word "warm" was scored wrong. In 
some cases, of course, one could legiti- 
mately question the decision that a 
particular response was bizarre or ir- 
relevant. In this connection it is per- 
haps worth pointing out that the cor- 
relations between RAT (a measure of 
creativity) and the wrongs scores for 
Fa and Fi were -.02 and -.06, respectively, 
that is, certainly low 
enough to suggest that relevant but 
remote associations were not being 
counted in the wrongs scores for Fa 
and Fi. 

The correlations between wrongs 
and rights scores are negative and very 
high for Figure Series, Common Word 
Analogies, and General Information. 
This indicates that these measure- 
ments were obtained under almost 
power conditions, as was the aim in 
this research. Had all subjects at- 
tempted all items in these tests, the 
correlations between rights and wrongs 
scores for the same test would neces- 
sarily be —1.0. The departure of the 
correlations from —1.0 is thus a rough 
indication of the speededness of test 
administration. 
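
The point that rights and wrongs scores must correlate -1.0 when every item is attempted can be checked with a small sketch. Everything here is an assumption made for illustration: the function name `pearson_r`, the 40-item test length, and the rights and attempted counts are hypothetical and are not data from the study.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

n_items = 40                       # hypothetical test length
rights = [12, 25, 31, 18, 36]      # hypothetical rights scores

# Power conditions: every item attempted, so wrongs = items - rights.
wrongs_power = [n_items - r for r in rights]
r_power = pearson_r(rights, wrongs_power)

# Speeded conditions: Ss attempt different numbers of items.
attempted = [20, 30, 40, 25, 38]   # hypothetical
wrongs_speeded = [a - r for a, r in zip(attempted, rights)]
r_speeded = pearson_r(rights, wrongs_speeded)

print(round(r_power, 3), round(r_speeded, 3))
```

When the number attempted varies across Ss, the wrongs score is no longer a linear function of the rights score, so the correlation moves away from -1.0, which is the sense in which the departure indexes speededness.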

The correlations between rights and 
wrongs scores for A, Vz, Cf, D, I, 
RAT, Fa, and Fi indicate that the two 
kinds of scores have about 25% of 
their variance in common. This is cer- 
tainly low enough to suggest that 
rights and wrongs scores represent 
somewhat distinct functions. It is 
therefore particularly interesting that 
the factorial results for the analyses of 
these two kinds of scores are so similar. 
Although there are differences in detail, 
the same essential functions, 
replicating previous results, can be 
identified in both analyses. The first 
four factors are easily identified as the 
fluid intelligence, crystallized intelli- 
gence, general visualization, and gen- 
eral fluency dimensions previously 
identified by Horn and Cattell 
(1966a). 
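
The statement that two scores have about 25% of their variance in common corresponds to a correlation of about .5 in absolute value, since the shared variance is the square of the correlation. A minimal sketch follows; the function name `shared_variance` is hypothetical, and the last two values are the extremes of the range quoted in the abstract.

```python
def shared_variance(r):
    """Proportion of variance two variables share, given correlation r."""
    return r * r

# +.5 and -.5 both give 25%; +.51 and -.98 are the extremes of the
# range quoted in the abstract.
for r in (0.50, -0.50, 0.51, -0.98):
    print(r, round(shared_variance(r), 2))
```

The sign of the correlation does not matter for the shared variance, which is why positive and negative correlations of the same size are treated alike in this part of the discussion.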

The fifth factor in the rights score 
analysis is characterized mainly by the 
memory tests but involves substantial 
loadings on CMR, I, CFR, Fa, and 
Cf. Each of these tests requires that 
the subject hold a relation or several 
relations in immediate awareness in 
order to compare it with other rela- 
tions. This suggests a link between this 
factor and the process which Spear- 
man first described as span of appre- 
hension or what has been discussed in 
recent years under the heading of tem- 
poral integration. 

Factor VI in the rights score analy- 
sis represents a visualization function 
but in this case, in contrast to Factor 
III, Memory for Designs is involved. 
There is thus a suggestion that visual 
retention, as represented in Factor VI, 
is somewhat distinct from the visual 
“fluency” represented in Factor III. 

The simple structure in the analysis 
of wrongs scores is somewhat better 
than in the rights score solution; yet 
the salient factor coefficients tend to 
be of about the same magnitude. The 
suggestion is that analysis of wrongs 
scores yields a “cleaner” structure than 
analysis of rights scores. It is of some 
interest to compare the differences in 
loadings for the comparable salient 
variables in the rights-score and 
wrongs-score analyses, but it is prob- 
ably not worthwhile to occupy space 
with a detailed discussion of these at 
this time. 
It is perhaps worth noting, however, 
that Remote Associations in the rights- 
score analysis has a loading of only 
.15 on the general-fluency dimension, 
whereas in the wrongs-score analysis 
the comparable loading is .61. General 
fluency is interpreted as a facility in 
bringing concept labels (words) from a 
long-term storage center into im- 
mediate awareness. The results ob- 
tained here suggest that this facility is 
not characterized by criticality. That 
is, the concepts implied need not be 
related in a precise and logical man- 
ner, for this is the requirement im- 
posed on associations in the rights 
score for RAT and this variable does 
not help to define F, whereas this restriction 
is removed in the wrongs 
score for RAT and in this case the 
variable does help to define F. 

Factor V in the wrongs-score anal- 
ysis does not match any factor found 
in the rights-score analysis or in our 
previous work. It is not readily in- 
terpreted. We will therefore hold our 
attempt at identifying it until after 
we find that it can be replicated. 

Something of a paradox is thus 
posed by these results. When a test is 
scored by the wrongs-score procedure, 
the person who adopts a strategy of 
avoiding errors has the advantage; 
when the same test is scored by the 
rights-score procedure, the person who 
adopts a strategy of getting as many 
right as possible, even at the cost of a 
few errors, has the advantage. The 
same basic intellectual functions are 
indicated by analyses of both kinds of 
scores. This means that regardless of 
strategy adopted, knowing how a sub- 
ject performs on one kind of task in a 
factor pattern, we can predict with 
considerably better than chance ac- 
curacy how he will perform in all 
other tasks defining the pattern. Yet, 
and here is the paradox, persons who 
are rated high in a particular func-
tion when this is defined in terms of
rights scores will rather frequently (as 
suggested by correlations of the order 
of .5) be rated comparatively low in 
this same function when it is defined 
in terms of wrongs scores (that is, a 
constant minus number of wrongs, to 
define the variable so that a high score 
indicates good performance). 

This is indeed an interesting finding 
and one which has practical as well as 
theoretical implications. It is to be 
hoped that follow-up studies now in 
progress will shed additional light on 
the questions here provoked.


ERRATUM


On page 332 of "Structure of Intelligence in Negro and White Children" by 
Ira J. Semler and Ira Iscoe in the December 1966 issue the last sentence in 
the section WISC Structure: Multivariate Analysis of Variance and Explora- 
tory Factor Analyses should read: Factor II for the white sample was a com- 
plex factor involving loadings of the WISC A, DS, BD, and C subtests.


DISCUSSION VERSUS MEMORY IN COOPERATIVE
GROUP CONCEPT ATTAINMENT 


PATRICK R. LAUGHLIN AND MARY AUSTIN DOHERTY
Loyola University 


In order to determine the relative importance of discussion vs. memory 
in group-concept attainment, the performance of 2-person cooperative 
female groups was investigated in 5 concept-attainment problems. A 
2 X 2 X 2 X 2 factorial design was used with the variables: (a) dis- 
cussion (allowed or not allowed), (b) memory (paper allowed or not 
allowed), (c) stimulus display (form or sequence), (d) number of rele- 
vant problem attributes (2 or 4). Discussion resulted in fewer card 
choices to solution, lower percentage of untenable hypotheses, and more 
time to solution, while memory had no effect. There were no main 
effects for the use of focusing or scanning strategies, although both 
strategies showed similar complex relationships in the interactions of
the 4 variables.


In a study of concept attainment by 
individuals versus two-person groups, 
Laughlin (1965) found that groups 
required fewer card choices to solu- 
tion and used the focusing strategy 
more than individuals. Although this 
group superiority in card choices was 
not found using the Taylor-McNemar 
(1955) correction model, the greater 
use of focusing by groups remained 
even after the correction. Focusing 
strategy is distinguished from scan-
ning as one of two basic problem-solv- 
ing processes adopted by subjects in 
a concept-attainment situation by 
Bruner, Goodnow, and Austin (1956). 
In focusing the subject tests the rele- 
vance of all the possible hypotheses 
involved in a particular attribute or 
attributes by choosing a card differing 
in one (conservative focusing) or more 
(focus gambling) attributes from a 
positive focus card. In scanning he 
tests specific hypotheses, either singly 
(successive scanning), all at once (si- 
multaneous scanning), or some inter- 
mediate number. In general, focusing 
is a more effective strategy, which 
Bruner et al. (1956) interpret as due 
to the greater inference and memory 
requirements of scanning. 
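
As a rough illustration of why conservative focusing makes light demands on memory and inference, the sketch below simulates it for a conjunctive concept over six two-valued attributes. The attribute names are taken from the form display described under Method; the function names and the example concept are hypothetical, and this is not the scoring procedure used in the study.

```python
# Hypothetical sketch of conservative focusing (after Bruner et al., 1956).
ATTRIBUTES = {
    "shape": ("square", "triangle"),
    "size": ("large", "small"),
    "number": ("one", "two"),
    "color": ("red", "green"),
    "pattern": ("striped", "solid"),
    "borders": ("one", "two"),
}


def is_positive(card, concept):
    """A card is a positive instance if it matches every value in the concept."""
    return all(card[attr] == value for attr, value in concept.items())


def conservative_focusing(focus_card, concept):
    """Alter exactly one attribute of the positive focus card per choice.

    If the altered card is still positive, the attribute is irrelevant;
    if it becomes negative, the focus card's value belongs to the concept.
    Returns the inferred concept and the number of card choices used.
    """
    inferred = {}
    choices = 0
    for attr, (first, second) in ATTRIBUTES.items():
        test_card = dict(focus_card)
        test_card[attr] = second if focus_card[attr] == first else first
        choices += 1
        if not is_positive(test_card, concept):
            inferred[attr] = focus_card[attr]
    return inferred, choices


# Hypothetical two-attribute concept and positive focus card.
concept = {"color": "red", "shape": "square"}
focus = {"shape": "square", "size": "large", "number": "one",
         "color": "red", "pattern": "solid", "borders": "two"}
inferred, choices = conservative_focusing(focus, concept)
print(inferred, choices)
```

In this sketch every attribute is tested, so six choices are used; a subject who knows how many attributes are relevant could stop sooner. Each choice requires holding in mind only the focus card and the attributes already settled.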

Since memory and inference require- 
ments seem to be two basic elements
of a concept-attainment task, group
superiority may be due to either better 
memory, better inference, or both. The 
important effects of memory in con- 
cept learning have been demonstrated 
in experiments by Hovland and Weiss
(1953), Cahill and Hovland (1960),
Hunt (1961), Bourne, Goldstein, and 
Link (1964), and extensively reviewed 
by Dominowski (1965). Thus, the 
superiority of groups may have been 
due to their better memory. An alter- 
native explanation involves the infer- 
ence or logic requirements of the con-
cept-attainment problem, in which 
the discussion process between the in- 
dividuals may have enabled them to 
evaluate the information on each card 
choice, reason concerning the meaning 
of positive and negative instances, 
eliminate errors in inference by moni- 
toring each other’s card choices and 
hypotheses, and so forth. These proc- 
esses can be summarized as the ef- 
fects of discussion as opposed to mem- 
ory.

Thus, in order to test these two al-
ternative explanations of memory and 
discussion for the superiority of groups 
over individuals, the present study 
compared two-person cooperative 
groups in which paper (memory) and 
discussion were or were not allowed in
a factorial design. In order to extend
and replicate previous research 
(Laughlin, 1965, 1966) two additional 
variables were included: (a) form 
versus sequence stimulus displays, (b) 
two-attribute versus four-attribute 
problems. Finally, the subjects were 
females rather than the males of pre- 
vious research. 


METHOD 


Design 


A 2 X 2 X 2 X 2 factorial design was
used with the following variables: (a) dis- 
cussion (allowed or not allowed), (b) mem- 
ory (paper allowed or not allowed), (c) 
stimulus display (form or sequence), (d) 
number of relevant attributes (two or four).


Subjects 


The subjects were 192 female college 
students at Alverno College, Milwaukee. 
Six pairs were randomly assigned to each of 
the 16 experimental conditions.
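
The design arithmetic can be checked with a short sketch. The factor labels and variable names below are illustrative only, and the error degrees of freedom assume that the two-person group is the unit of analysis, which is consistent with the F (1, 80) values reported under Results.

```python
from itertools import product

# Factor levels as described in the Design section; labels are illustrative.
FACTORS = {
    "discussion": ("allowed", "not allowed"),
    "memory": ("paper", "no paper"),
    "display": ("form", "sequence"),
    "attributes": (2, 4),
}
PAIRS_PER_CELL = 6

# One dictionary per experimental condition.
cells = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]

n_pairs = len(cells) * PAIRS_PER_CELL      # two-person groups
n_subjects = 2 * n_pairs                   # individual subjects
error_df = n_pairs - len(cells)            # within-cell degrees of freedom

print(len(cells), n_pairs, n_subjects, error_df)
```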


Stimulus Displays and Problems 


The problem materials were six 28-inch 
X 44-inch white poster boards, each con-
taining an 8 X 8 array of 64 2½-inch by 4-
inch cards drawn in colored ink with dark
outlines. The 64 cards represented all the
possible combinations of six attributes with 
two levels of each attribute. Form displays 
consisted of the following attributes and 
values: (a) shape: square or triangle, (b) 
size: large or small, (c) number: one or 
two, (d) color: red or green, (e) pattern: 
striped or solid, (f) borders: one or two.
Sequence displays consisted of all combi-
nations of six plus and/or minus signs in a
row. In order to facilitate reference to the
six positions, each was a different color, so
that the color name was the attribute and 
plus or minus the value of each color. The 
attributes and values were listed on refer- 
ence cards which the subjects could use 
throughout the experiment. Three different 
ordered displays, in which the position of 
each card varied systematically in relation 
to the other cards (e.g., all red figures on
the top four rows), were used for both 
form and sequence stimulus displays. Each 
problem and initial card for each pair of 
subjects were randomly selected from the 
total subsets of possible two-attribute or
four-attribute concepts. All pairs solved five
problems.
problems. 
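
To make the size of the problem space concrete, the sketch below enumerates the 64 cards and counts the conjunctive concepts that fix one value on each of two or four attributes. Coding the two values of each attribute as 0 and 1 is an assumption made for brevity, as is treating every such concept as eligible for selection; the function names are hypothetical.

```python
from itertools import combinations, product

N_ATTRIBUTES = 6

# Each card is a tuple of six binary values (one per attribute).
cards = list(product((0, 1), repeat=N_ATTRIBUTES))


def conjunctive_concepts(k):
    """All concepts that fix one value on each of k of the six attributes."""
    concepts = []
    for attrs in combinations(range(N_ATTRIBUTES), k):
        for values in product((0, 1), repeat=k):
            concepts.append(dict(zip(attrs, values)))
    return concepts


def positive_instances(concept):
    """Cards that match the concept on every attribute it specifies."""
    return [c for c in cards if all(c[a] == v for a, v in concept.items())]


two_attribute = conjunctive_concepts(2)
four_attribute = conjunctive_concepts(4)
print(len(cards), len(two_attribute), len(four_attribute))
print(len(positive_instances(two_attribute[0])),
      len(positive_instances(four_attribute[0])))
```

Under these assumptions a two-attribute concept has 16 positive instances on the board and a four-attribute concept has only 4.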


Procedure 

The instructions demonstrated the mean- 
ing of conjunctive concepts, pointed out 
the attributes and values on the stimulus 
display, explained that the problems all 
involved two (or four) relevant attributes, 
and emphasized that they were to be solved 
in as few card choices as possible, regard- 
less of time (Laughlin, 1964). It was empha- 
sized in all conditions that the two persons 
were not competing in any way, but would 
be scored as a cooperative group and com- 
pared with other groups. In the no-discus- 
sion conditions either of the two persons 
could make each card choice and hypothe- 
sis, and the two persons were seated ad- 
jacently so that they saw and heard both 
each other and the experimenter, but could 
not speak with each other. They were in- 
structed to listen carefully to each other's 
card choices and hypotheses, and to attempt 
to utilize this information. 


RESULTS 


The means for the 16 treatment 
groups for (a) number of card choices 
to solution, (b) percentage repetitions 
of hypotheses, (c) percentage of un- 
tenable hypotheses, (d) time to solu- 
tion, (e) focusing strategy, and (f) 
scanning strategy are given in Table 1. 
All measures are for totals over the 
five problems, and totals over both 
subjects regardless of which subject 
made the card choice or hypothesis. 


Card Choices to Solution 


Two-person groups who were al- 
lowed discussion had fewer card 
choices to solution than those not 
allowed discussion, F (1, 80) = 6.91, 
p < .02. There was no difference be- 
tween those who used paper and those 
who did not, F < 1. Likewise, the 
effects of stimulus display and num- 
ber of problem attributes were not 
significant, F (1, 80) < 1; F (1, 80) 
= 2.54, respectively. The Stimulus 
Display X Attributes interaction was
significant, F (1, 80) = 4.02, p < .05, 




TABLE 1
MEAN NUMBER OF CARD CHOICES TO SOLUTION, PERCENTAGE REPETITION OF
HYPOTHESES, PERCENTAGE UNTENABLE HYPOTHESES, TIME TO SOLUTION,
FOCUSING STRATEGY, AND SCANNING STRATEGY, OVER FIVE PROBLEMS

                      |             Paper             |           No paper
                      |     Form      |   Sequence    |     Form      |   Sequence
                      |  Two  | Four  |  Two  | Four  |  Two  | Four  |  Two  | Four

Discussion
Card choices          | 18.33 | 22.50 | 20.67 | 21.67 | 28.50 | 22.67 | 15.83 | 24.67
Repetition hypotheses |   .05 |   .08 |   .01 |   .08 |   .08 |   .07 |   .04 |   .12
Untenable hypotheses  |   .17 |   .39 |   .16 |   .21 |   .31 |   .26 |   .20 |   .30
Time to solution      |  20.8 |  31.3 |  16.7 |  22.8 |  20.2 |  43.3 |  19.8 |  31.5
Focusing strategy     |  3.56 |  2.42 |  2.45 |  2.30 |  2.81 |  3.21 |  3.25 |  2.55
Scanning strategy     | 21.76 | 22.32 | 16.94 | 19.06 | 16.12 | 18.79 | 25.01 | 17.13

No discussion
Card choices          | 30.33 | 25.83 | 23.50 | 28.67 | 21.50 | 25.33 | 22.00 | 27.33
Repetition hypotheses |   .07 |   .06 |   .06 |   .12 |   .07 |   .19 |   .22 |   .09
Untenable hypotheses  |   .44 |   .39 |   .23 |   .31 |   .24 |   .48 |   .28 |   .49
Time to solution      |  15.5 |  27.7 |  13.2 |  26.2 |  14.0 |  23.3 |  13.2 |  20.5
Focusing strategy     |  2.25 |  2.61 |  2.72 |  2.61 |  3.11 |  2.39 |  2.48 |  1.82
Scanning strategy     | 16.72 | 16.78 | 18.30 | 18.96 | 18.96 | 18.51 | 18.01 | 16.77


Note.—Time in minutes. Maximum focusing strategy is 5.00.


so that two-attribute and four-attri-
bute form problems showed little dif-
ference, while two-attribute sequence
problems were solved in fewer card
choices than four-attribute sequence
problems.


Percentage Repetitions of Hypotheses


In order to control for the different
numbers of hypotheses across condi-
tions, the percentage of repetitions of
hypotheses was determined by divid-
ing the number of repetitions of hy-
potheses by the number of hypotheses.
Both the effects of discussion and
memory were significant. Discussion
groups had a lower percentage of repe-
titions of hypotheses than no-discus-
sion groups, F (1, 80) = 4.13, p < .05;
and paper groups had a lower per-
centage than no-paper groups, F (1,
80) = 3.97, p < .05. The effects of
stimulus display and number of prob-
lem attributes were not significant.


Percentage Untenable Hypotheses


Untenable hypotheses were of two
types: (a) a hypothesis for a value of
an attribute when the other value of
the attribute had previously occurred
on a positive instance, for example, the
hypothesis "red square" when a green
instance had been positive; (b) a hy-
pothesis which had previously oc-
curred on a negative instance, for ex-
ample, the hypothesis "red square"
when an instance with a red square
had been negative. In order to con-
trol for the different number of hy-
potheses across conditions, the number
of untenable hypotheses was divided
by the number of hypotheses. Dis-
cussion groups had a lower percentage
of untenable hypotheses than no-dis-
cussion groups, F (1, 80) = 12.32, p
< .001; paper and no-paper groups
did not differ, F (1, 80) = 1.05; form
displays had a lower percentage than
sequence displays, F (1, 80) = 4.43,
p < .05; and two-attribute problems
had a lower percentage than four-at-
tribute problems, F (1, 80) = 10.60,
p < .001.
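
The two types of untenable hypothesis amount to a consistency check against the instances seen so far, which can be sketched as follows. This is one reading of the definitions given above, not the authors' scoring protocol; the function names, the two-attribute cards, and the example data are hypothetical.

```python
def is_untenable(hypothesis, history):
    """Check a hypothesis against earlier instances.

    history is a list of (card, was_positive) pairs. A hypothesis is
    untenable if (a) some positive instance carried the other value of
    one of its attributes, or (b) some negative instance carried all of
    its values.
    """
    for card, was_positive in history:
        matches = all(card[attr] == value for attr, value in hypothesis.items())
        if was_positive and not matches:
            return True   # type (a)
        if not was_positive and matches:
            return True   # type (b)
    return False


def proportion(count, n_hypotheses):
    """Divide a count by the number of hypotheses, guarding against zero."""
    return count / n_hypotheses if n_hypotheses else 0.0


red_square = {"color": "red", "shape": "square"}

# Type (a): a green instance has already been positive.
history_a = [({"color": "green", "shape": "square"}, True)]
# Type (b): an instance with a red square has already been negative.
history_b = [({"color": "red", "shape": "square"}, False)]
# Tenable: the only instance seen was a positive red square.
history_ok = [({"color": "red", "shape": "square"}, True)]

print(is_untenable(red_square, history_a),
      is_untenable(red_square, history_b),
      is_untenable(red_square, history_ok))
```

The same `proportion` helper covers the repetition measure, since both percentages divide a count by the number of hypotheses offered.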


Time to Solution 


Discussion groups required more 
time to solution than no-discussion 
groups, F (1, 80) = 7.13, p < .01; 
there was no difference between paper 
and no-paper groups, F (1, 80) < 1; 
there was no difference between form 
and sequence displays, F (1, 80) = 
2.74; and four-attribute problems re- 
quired more time than two-attribute, 
F (1, 80) = 22.57, p < .001. 


Focusing Strategy 


Focusing strategy was scored ac- 
cording to two rules. Rule 1: Each 
card choice had to obtain information 
on one new attribute. New information 
was obtained if the card choice altered 
only one attribute not previously 
proven irrelevant (conservative focus- 
ing), or, if more than one attribute was 
altered (focus gambling), the instance 
was either positive or the ambiguous 
information correctly resolved on the 
next card by altering only one at- 
tribute. Rule 2: If a hypothesis was 
made it had to be tenable considering 
the information available. Each card 
choice and accompanying hypothesis 
that satisfied these two rules was 
counted as an instance of focusing, 
and the total number of such instances 
was divided by the total number of 
card choices on the problem to give a 
continuous focusing score from .00 to 
1.00. 
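
A minimal sketch of the score arithmetic follows. It assumes that the judgment of whether each card choice satisfied both rules has already been made, and that the five-problem total is the sum of the per-problem scores, which would agree with the maximum of 5.00 noted in Table 1. The helper names and the example data are hypothetical.

```python
def altered_attributes(focus_card, card, irrelevant):
    """Attributes changed from the focus card, ignoring those proven irrelevant.

    One altered attribute corresponds to conservative focusing, more than
    one to focus gambling.
    """
    return [a for a in focus_card
            if card[a] != focus_card[a] and a not in irrelevant]


def focusing_score(met_both_rules):
    """Proportion of card choices on one problem that met both rules."""
    if not met_both_rules:
        return 0.0
    return sum(met_both_rules) / len(met_both_rules)


def total_focusing(problems):
    """Sum of per-problem scores (assumed aggregation over five problems)."""
    return sum(focusing_score(p) for p in problems)


focus = {"shape": "square", "color": "red", "size": "large"}
card = {"shape": "square", "color": "green", "size": "small"}
altered = altered_attributes(focus, card, irrelevant={"size"})

# Hypothetical record for one group: one list of flags per problem.
problems = [
    [True, True, False, True],
    [True, True],
    [False, True, True, True, False],
    [True, False],
    [True, True, True],
]
total = total_focusing(problems)
print(altered, round(total, 2))
```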

None of the four main effects of dis- 
cussion, memory, stimulus display, or 
attributes were significant, F (1, 80) 
= 2.18; F < 1; F < 1; F = 2.52,
respectively. The Discussion X Mem- 
ory X Stimulus Display interaction 
was significant at the .05 level, F (1,
80) = 5.18. With discussion and paper
there was more focusing with form
displays than with sequence displays,
while with discussion and without 
paper there was no difference between 
form and sequence displays. Without 
discussion the relationships were re- 
versed, so that with paper there was 
no difference between form displays 
and sequence displays, while without 
paper there was more focusing with 
form displays. The Discussion X
Memory X Attributes interaction was 
also significant at the .05 level, F 
(1, 80) = 4.97, and showed the same 
relationships. With discussion and 
paper there was more focusing with 
two-attribute problems than with 
four-attribute, while without paper 
there was no difference between two- 
attribute and four-attribute problems. 
Without discussion the relationships 
were reversed, so that with paper there 
was no difference between two-attri- 
bute and four-attribute problems, 
while without paper there was more 
focusing with two-attribute problems. 
Finally, the four-way interaction was 
significant, F (1, 80) = 4.88, p < .05. 


Scanning Strategy 


Scanning strategy was scored by 
comparing each card in turn with the 
given problem card. If the selected 
card was positive, all concepts differ- 
ent on the given and selected cards 
were eliminated; if the selected card 
was negative, all concepts identical on 
the given and selected cards were 
eliminated. The total of the number 
of concepts thus eliminated plus those 
concepts eliminated by direct hy- 
potheses was then divided by the total 
number of card choices on the problem 
in order to determine the average 
number of concepts eliminated per 
card choice. This measure was con- 
sidered an index of scanning. 
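
One way to read this scoring rule is sketched below. It assumes that the candidate concepts are all k-attribute concepts consistent with the given positive card, each identified by the set of attributes it involves, and it omits concepts eliminated by direct hypotheses. The function name, attribute labels, and example choices are hypothetical.

```python
from itertools import combinations


def scanning_index(given_card, choices, k):
    """Average number of candidate concepts eliminated per card choice.

    choices is a list of (card, was_positive) pairs. A positive card
    eliminates candidates involving any attribute on which it differs
    from the given card; a negative card eliminates candidates on which
    the two cards are identical.
    """
    candidates = set(combinations(sorted(given_card), k))
    eliminated = 0
    for card, was_positive in choices:
        same = {a for a in given_card if card[a] == given_card[a]}
        if was_positive:
            out = {c for c in candidates if not set(c) <= same}
        else:
            out = {c for c in candidates if set(c) <= same}
        candidates -= out
        eliminated += len(out)
    return eliminated / len(choices) if choices else 0.0


# Hypothetical example: six attributes labeled a-f, true concept involves a and b.
given = {name: 1 for name in "abcdef"}
choices = [
    (dict(given, a=0), False),   # altering a makes the card negative
    (dict(given, b=0), False),   # altering b makes the card negative
]
index = scanning_index(given, choices, k=2)
print(index)
```

In this example the first choice removes the 10 candidates that do not involve attribute a and the second removes 4 more, leaving a single concept after two choices.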

There was only a slight trend for
discussion groups to have higher scan-
ning scores at the .10 level, F (1, 80) =
2.68. The effects of memory, stimulus
display, and attributes were all non-
significant, all Fs < 1. The Discussion
X Memory X Stimulus Display inter-
action was significant, F (1, 80) =
6.38, p < .02. The relationships were
parallel to those for focusing but more
pronounced. With discussion and pa-
per there was more scanning with form
displays than with sequence displays,
while without paper there was more
scanning with sequence displays.
Without discussion the relationships
were reversed, so that with paper
there was more scanning with se-
quence displays, while without paper
there was more scanning with form
displays.


TABLE 2
INTERCORRELATIONS OF RESPONSE MEASURES

                      | Card    | Repetition | Untenable  | Time to     | Focusing
                      | choices | hypotheses | hypotheses | solution    | strategy
Repetition hypotheses |   .30   |            |            |             |
Untenable hypotheses  |   .52   |    .24     |            |             |
Time to solution      |  -.33   |    .12     |    .25     |             |
Focusing strategy     |  -.54   |   -.12     |   -.23     |    -.02     |
Scanning strategy     |  -.68   |   -.14     |   -.14     | [illegible] |   .51


Correlations between Response Meas- 
ures 


Correlations between the six re-
sponse measures are given in Table 2.


DISCUSSION


The major purpose of the experi-
ment was to assess the relative im-
portance of discussion and memory
as potential factors in the previously
demonstrated superiority of group
over individual concept attainment.
In terms of the basic dependent varia-
ble, number of card choices to solu-
tion as an indicator of efficiency of 
performance, discussion was clearly 
the more important factor, as discus- 
sion groups had fewer card choices 
than no-discussion groups while there 
was no difference for paper and no- 
paper groups. The same relationship 
was found for percentage of untenable 
hypotheses. Basically, the benefits of 
discussion seem to involve an infer- 
ence and monitoring process, in which 
the two persons can both reason con- 
cerning the meaning of each card 
choice and hypothesis and check each 
other. Through discussion they can 
reduce erroneous inference and insure 
more efficient card choices, thus solv- 
ing the problems in fewer card choices 
and making fewer untenable hypothe- 
ses. However, discussion groups re- 
quired more time than no-discussion 
groups, while there was no difference 
between paper and no-paper groups. 
This result parallels the greater time 
required by groups than by individ- 
uals (Laughlin, 1965). 

Discussion and memory did not re- 
sult in significantly different use of 
either focusing or scanning strategies, 
although there were trends favoring 
discussion groups over no-discussion 
groups on both strategies. The differ- 
ences in card choices and untenable 
hypotheses as a function of discussion, 
the lack of difference as a function of 
memory, the high correlation between 
focusing and scanning, and the paral- 
lel interactions between discussion, 
memory, and display for the two 
strategies seem to indicate the follow- 
ing overall conclusion: Discussion re- 
sults in more effective problem solving 
rather than specific changes in the use 
of focusing or scanning, while memory 
has no effect. In other words, whether 
focusing or scanning was used, dis- 
cussion enabled it to be used more
effectively while paper made no differ-
ence. 

Contrary to previous research 
(Bourne, 1957, 1966; Bourne & Hay- 
good, 1958; Laughlin, 1965; Shepard, 
Hovland, & Jenkins, 1961), there were
no main effects indicating an increase 
in number of card choices with in- 
creased task complexity for four-at- 
tribute concepts versus two-attribute 
or sequence versus form displays. 
However, the Display X Attributes 
interaction was significant, so that 
there was no difference between two- 
attribute and four-attribute form 
problems, while four-attribute se- 
quence problems required more card 
choices than two-attribute sequence 
problems. Thus, the combination of 
both the more complex sequence dis- 
play and four-attribute problems re- 
sulted in an increased number of card 
choices although neither displays nor 
number of attributes alone resulted in 
a complexity effect. Since previous re- 
search has involved individuals work- 
ing alone (except group conditions in 
Laughlin, 1965), the increased com-
plexity necessary to demonstrate in- 
creased card choices may indicate a 
social facilitation effect from the 
mere presence of the other person, 
apart from the effects of discussion 
(Zajonc, 1965).
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Introducing an Original Paperbound Series... 
CURRENT ISSUES AND RESEARCH 
IN EDUCATION 


General Editor: Harry L. Miller, Hunter College 

The Current Issues and Research in Education series is an innovation that, in the 
struggle to keep up with the “information explosion” in the rapidly expand- 
ing field of education, encompasses and illuminates a wide range of new 
ideas, experimentation, and criticism. Each volume contains excerpts, sum- 
maries, and entire articles that survey the most recent research findings, 
comment on persistent issues, and evaluate ongoing experiments and new 
ideas for the future. The editors’ introductions provide the necessary his- 
torical background and underscore the relationships of the articles to general 
trends in the field. Throughout, special emphasis is placed on the significance 
of these issues for teachers in today’s schools. 


Volumes now available 


THE PSYCHOLOGY ELEMENTARY 

OF EDUCATION EDUCATION 

Edited by Donald H. Clark, Edited hy Maurie Hillson, 

Hunter College Rutgers University 
EDUCATION FOR THE SOCIAL FOUNDATIONS 
DISADVANTAGED OF EDUCATION 

Edited by Harry L. Miller, Edited by Dorothy Westby-Gibson, 
Hunter College San Francisco State College 


Just published All volumes $2.95 each 


BEHAVIOR IN INFANCY AND 
EARLY CHILDHOOD 


A Book of Readings Edited by Yvonne Brackbill, University of 
Denver, and George G. Thompson, Ohio State University (A.V 
Presenting the work of more than sixty of the world's leading authorities in 
the field, this comprehensive reader examines every aspect of development 
during infancy and early childhood—from the psychological to the social. 
Ranging from established classics to the first English translations of pioneer- 
ing Russian studies, the selections present the full scope of contemporary 
research and are organized to encourage comparative examination. 

Just published 704 pages, illus. $9.95 


Available at your bookstore or directly from... 
THE FREE PRESS 
A Division of The Macmillan Company 
866 Third Avenue, New York 10022 
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nine ways to enrich your understanding 


H 


EDUCATIONAL PSYCHOLOGY 
IN THE CLASSROOM 


Third Edition 


By HENRY CLAY LINDGREN, San Francisco State College. Focusing on 
the teacher as a behavioral scientist, the new edition of this successful book 
is concerned mainly with the ongoing classroom situation. New features 
include a unique chapter on the slum child; strong emphasis on the contribu- 
tions of social psychology; a comprehensive treatment of field theory and 
its application to educational practices and problems; inclusion of such 
relatively-ignored topics as social learning theory, creativity, and needs 
for affiliation and achievement. 1967. 686 pages. $8.50 


INTRODUCTION TO STATISTICAL 
ANALYSIS AND INFERENCE 


For Psychology and Education 


By SIDNEY J. ARMORE, George Washington University. Provides a secure 
foundation in both descriptive statistics and statistical inference. All topics 
are supported by detailed explanations, illustrative problems, graphs and 
charts, anda review of algebraic symbols and operations is set forth 
in an appendix. 1966. 546 pages. $8.95. 


BEHAVIORAL SCIENCE FRONTIERS 
IN EDUCATION 


Edited by ELI M. BOWER, National Institute of Mental Health, Bethesda, 
Maryland; and WILLIAM G. HOLLISTER, University of North Carolina School of 
Medicine. Prominent behavioral scientists—theorists and practitioners—set 
forth the germinal ideas and Programs which they represent. 1967. Approx. 
544 pages. Prob. $8.95. 


JOHN WILEY & SONS, Inc. 
605 Third Avenue, New York, N.Y. 10016 
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GROUPING IN EDUCATION 


Edited by ALFRED YATES. Sponsored by the UNESCO Institute for Educa- 
tion, Hamburg, Germany. Describes and evaluates commonly-practiced 
grouping procedures, and presents accounts of research into grouping 
practices in the form of four special articles and fifty research abstracts. 
1966. 314 pages. $1275. 


CLASSROOM GROUPING FOR TEACHABILITY 


By HERBERT A. THELEN, University of Chicago; with the assistance 
of HENRY PETERSON, ALAN OPPENHEIM, WILLIAM HOOCK, STEVEN 
PERLS, and ANNE BRODY. 1967. Approx. 288 pages. $7.50. 


READINGS IN THE PSYCHOLOGY OF 
PARENT-CHILD RELATIONS 

Edited by GENE R. MEDINNUS, San Jose State College. 1966. 371 
pages. $4.50. 


STUDIES IN COGNITIVE GROWTH 


By JEROME S. BRUNER, Harvard University; with ROSE R. OLVER, Amherst 
College; PATRICIA M. GREENFIELD et al. 1966. 343 pages. $7.95. 


THE COMPUTER IN AMERICAN EDUCATION 

By DON D. BUSHNELL, Brooks Foundation, Santa Barbara; and DWIGHT 
W. ALLEN, Stanford University. 1967. Approx. 256 pages. Cloth: Prob. 
$6.95. Paper: Prob. $3.95. 


INTERNATIONAL STUDY OF crear pre obi 
IN MATHEMATICS 


A Comparison of Twelve i car 


Volumes I and II 
Chairman and Editor: TORSTEN HUSEN, University of Stockholm. 1967. 
Prob. $19.95 the set. In press. 


JOHN WILEY & SONS, Inc. 
605 Third Avenue, New York, N.Y. 10016 
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Pretty Painful 


But it doesn't have to be incurable. 


Not when a special program like SRA's 
READING IN HIGH GEAR has helped func- 
tionally illiterate individuals to read at sixth- 
to-eighth grade levels with 150 to 300 hours 
of instruction.* 


READING IN HIGH GEAR is a modern pro- 
gram particularly applicable for helping 
mature semi-literate and illiterate students 
learn the basic communicative skills which 
they need to make their lives more useful. 


The stories used in the program are planned 
to be interesting to the student — relevant to 


his own experiences and his own age group. 


The story themes are such that they can 
help develop positive social attitudes and a 
healthy orientation to society. 


READING IN HIGH GEAR is a versatile pro- 
gram, and can be effectively adapted for use 
with remedial students and pre-dropouts. 


For many people, learning to read can be the 
all-important first step toward a better life. 
Write for a detailed booklet or ask your near- 
est SRA Staff Associate to tell you more 
about READING IN HIGH GEAR. 


*In terms of word-attack skills. Generally, 
grade levels are not assigned in the pro- 
gram because of the uniqueness ofthe 
learning technique. 


Science Research Associates, Inc. 
259 E. Erie St., Chicago, IIl. 60611 


A Subsidiary of IBM 
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How do you measure the value of a textbook? 


COMPREHENSIVENESS —— 
ee 
H DOCUMENTATION — 


Does the psychology text you are presently using 
measure up as well as this outstanding new textbook? 


Heckel-Peacock 
TEXTBOOK OF GENERAL PSYCHOLOGY 


Designed for college and university courses in “General Psychology”, this new text 
Provides a broad overview of psychology at the introductory level. Comprehensive 
and thoroughly documented, its well-planned flexible format is sufficiently detailed 
to be used without supplementation for the usual one quarter or one semester course. 
Yet, it is flexible enough to be expanded to your specific requirements with reprints 
from the Bobbs-Merrill Series or from SCIENTIFIC AMERICAN. 

Logically organized into six analytical sections, it introduces biological and develop- 
mental topics early in the presentation and emphasizes the developmental, physiologi- 
cal and historical areas, You will find that it provides a more expansive treatment of 
the physiological, sensory, group and developmental areas than other introductory textbooks. 
The easily understood discussions examine all essential aspects of modern psychology in 
depth with analogies to everyday behavior and experience. 

A complimentary Teacher’s Guide and Test Booklet is provided each instructor adopt- 
ing this new book. Examine a copy of TEXTBOOK OF GENERAL PSYCHOLOGY 
and compare it to the text you are presently using. You may find that it contains more 
of the essential features you require for your introductory courses. 

b al Training, Uni- 
WIN T MERE Du emet aee rr DE S AA 
of Psychology, University of Georgia, Athens, Georgia. Contributing author: ROBERT E. 


McCARTER, Ph.D., University of South Carolina, Columbia, South Carolina. Publication date: 
September, 1966. 480 pages plus FM I-XII, with 69 figures, 7” x 10". Price, $8.50. 


THE C. V. MOSBY COMPANY Publishers 
3207 Washington Boulevard St. Louis, Mo. 63103 


Announcing ... 


LANGUAGE 


AND LANGUAGE 


BEHAVIOR 
ABSTRACTS 


...a new journal of abstracts 
whose aim is to provide 


ACCESS: RAPID, 
SELECTIVE 
AND COMPREHENSIVE 


to the literature of 
language and language behavior 


* whatever the disciplinary focus 


* whatever the country of origin 


* whatever the language in which it is 
written 


aes CHORI Ey of CRLLB AND 
BELC 


Center for Research on Language and 
Language Behavior, University of 
Michigan, Ann Arbor, Michigan. 


Bureau pour l'Enseignement de la Langue 
et de la Civilisation Francaises a l'Etranger, 
Paris, France. 


Subscription price is $22.50 a volume, Each 
volume will contain four quarterly issues. 
Each issue will contain 1,000 abstracts. 


Subscription orders should be sent to: 


AC 


Subscription Manager, LLBA 
Appleton: Century 
Crofts 


Division of Meredith Publishing Co. 
440 Park Ave. South, New York, N.Y. 10016 


Consider this important 
new text... 


CASEBOOK ON 
ETHICAL 
STANDARDS FOR 
PSYCHOLOGISTS 


The application of the 19 principles 
comprising the Code of Ethics for all 
psychologists is presented. Each 
principle is illustrated by actual, but 
disguised, cases involving alleged vio- 
lations of the principle and action 
taken by the Board of Professional 
Affairs. 


Four Appendices state and amplify 
the additional principles applicable 
to specific areas of psychology: 


—Ethical Practices in Industrial Psy- 
chology 


—Automated Test Scoring and Inter- 
pretation Practices 


—Standards for APA Directory 
Listings of Private Practice 


—Guiding Principles for the Humane 
Care and Use of Animals 


The arrangement of the Casebook 
into a general section of cases and 
four additional specific sections in- 
creases its versatility as a reference 
book for individuals and as a text for 
instructional purposes. 


Price: $1.00 


Order From: 
AMERICAN PSYCHOLOGICAL 
ASSOCIATION 


1200 Seventeenth Street, N. W. 
Washington, D. C. 20036 
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BLAISDELL BOOKS in PSYCHOLOGY 


AVAILABLE FOR FALL ADOPTION 
Three Important Adjunct Books for 
Key Courses in Educational Psychology 


EDUCATIONAL 
AND 


PSYCHOLOGICAL 
RENE MEASUREMENT: 
STUDIES IN Contributions to 
PSYCHOLOGY OF | EDUCATIONAL Theory and 
CHILDHOOD AND | PSYCHOLOGY. Practice. 


Edited by Edited by 
Edited by Raymond G. Kuhlen | David A. Payne and 
William J. Meyer — | for courses focusing on Robert McMorris— 
for courses focusing on | the Psychology of for courses focusing 
human development School learning on tests and 
Contains 65 theoretical | Contains 50 papers measurements 
papers and research | dealing with the Contains 54 theoretical 
reports dealing with learning process with — and research papers 
the main features of | Special attention to the | providing a broad 
child and adolescent Serui Pec SERI deuil on test 

and motivationa: levelopment wi 
development factors involved. an emphasis on the 


assessment and _ 
prediction of learning 
outcomes. 


BLAISDELL PUBLISHING COMPANY 
A DIVISION OF GINN AND COMPANY 
275 Wyman Street, Waltham, Massachusetts 02154 


ALLYN AND BACON, INC. 
CRITIQUE OF CATTELL’S “THEORY OF FLUID AND
CRYSTALLIZED INTELLIGENCE: A CRITICAL
EXPERIMENT”

LLOYD G. HUMPHREYS

University of Illinois


In a complexly designed factor analysis of ability and personality
variables Cattell found evidence for higher-order factors representing
his hypothesized fluid and crystallized abilities. This critique demon-
strates that there are methodological weaknesses at every stage of
the study from the selection of the variables, through the 1st-order fac-
toring, to the 2nd-order factoring such that one can not have confi-
dence in the results. In contrast, a simple 9 × 9 table of intercorrela-
tions provides results that are in line with Cattell’s theory though
they are not critical in distinguishing his theory from Vernon's hier-
archical model of abilities. General principles for the design of factor-
analytic investigations also emerge.


Cattell (1963) has published the 
results of an investigation designed 
to test certain ideas presented earlier 
(Cattell, 1941) concerning the nature 
of intelligence. The title “Theory of
Fluid and Crystallized Intelligence:
A Critical Experiment” indicates con-
fidence in the design of the study,
and the conclusions drawn indicate
confidence in the results. After ex-
amination of the design and of the
results, however, the present writer
concluded that he lacked confidence
in both. The ideas which led to the
research are stimulating, but they de-
serve a better design and a more ob-
jective methodology.

In the study in question, second-
order factors, their intercorrelations,
third-order factors, and their inter-
correlations were presented. The fac-
torings on which the conclusions rest
are rather far removed from the basic
data, the intercorrelations of the var-
iables. The techniques of factor anal-
ysis are not sufficiently objective that
a reader can accept final rotated mat-
rices without question. This is par-
ticularly true of higher-order analy-
ses. It is essential to have test
intercorrelations and the first-order
factoring at hand in order to inter-
pret adequately, let alone criticize,
higher-order factoring.²

¹ The writer acknowledges the support of
the University Research Board of the Uni-
versity of Illinois in this investigation. Un-
fortunately, although three random normal
deviates were used in the original study, no
correlations or factor loadings of these
variables were included in any of the tables
of basic data available to the writer.

The design of the study is complex, 
and probably unnecessarily so. Re- 
sults will be presented later in this 
critique which indicate that a simple, 
straightforward approach provides a 
much less ambiguous basis for inter- 
pretations. This alternative method 
of analysis also has the merit that it 
supports the relationships among the 
ability variables hypothesized by 
Cattell! 


The Variables and Their Intercorrelations

The variables consisted of two
forms of the Verbal, Spatial, Reason-
ing, and Number tests of the Thurs-
tone Primary Mental Abilities series,
one form of the Word Fluency Test
of the same series, one form of each
subtest in the Culture Fair Intelli-
gence Test, and two forms of each
test of the Junior-Senior High School
Personality Questionnaire. The per-
sonality questionnaire variables were
included in accordance with Cat-
tell’s rationale of providing “hyper-
plane stuff.” The sample consisted of
277 high-school-age boys and girls.

² All but one of the basic tables of data
were obtained from ADI. R. B. Cattell
kindly furnished the missing table.


TABLE 1
CORRELATIONAL DATA FOR QUESTIONNAIRE VARIABLES

                        Highest correlation with another variable
Variable   Form A with
number     Form B       Form A   With          Form B   With
 1         +.36         −.31                   +.40     B 7
 2         +.28         +.39     Vocabulary    +.35     Number
 3         +.30         −.38     A 9           −.44     A 11
 4         +.27         +.37     A 14          −.40     B 7
 5         +.30         −.88     A 7           −.29     B 7
 6         +.28         −.28     A 12          −.29     B 10
 7         +.43         −.88     A 5           +.40     B 1
 8         +.31         −.43     A 11          +.38     B 3
 9         +.87         −.38     A 3           −.28     A 5
10         +.01         −.16     A 6           −.29     B 6
11         +.42         −.44     B 3           −.41     A 8
12         +.08         −.29     A 9           T2       B 1
13         +.06         −.31     A 4           +.35     B 4
14         +.33         +.41     B 3           −.37     B 3
Median     +.30          .375                   .36

There are several items of interest 
in the intercorrelations of these vari- 
ables. These are summarized in Table 
1. It is seen that the median cor-
relation between alternate forms of
the personality tests is .30. Three of
these correlations are not distin-
guishable from zero. In almost every
case one form or the other of each 
personality variable has higher cor- 
relations with some other variable 
than it has with its alternate. In 
many cases this is true for both 
forms. 

While it is statistically possible to 
have two uncorrelated variables each 
substantially correlated with the 
same factor, one hardly expects to
see this in tests that are supposed
to be alternate forms of each other. 
Furthermore, there should be a rather 
clearcut explanation for the suppres- 
sion effects necessary to account for 
such a situation. Suppression effects 
that are accumulated over a sub- 
stantial number of variables with 
the correlations involved not differ- 
ing appreciably from zero suggest an 
operation of capitalizing on chance 
rather than meaningful and replica- 
ble phenomena. 

One variable (23, which is A 10
of the personality questionnaire) has 
no correlation with anything greater 
than .16. Its mean correlation with 
40 other variables is +.024 and the 
standard deviation of these correla- 
tions is .073. This is to be compared 
with the standard error of a zero cor- 
relation of .060. This variable is little, 
if anything, more than a random 
normal deviate. Several others are 
fairly close to being the same. Linn 
(1964, 1965) has shown that random 
deviates can and do introduce “noise”
into the system. Far from helping to
define factors, they obscure the defi-
nition of factors. Horn (1966, in
press) has also shown that factors ex-
tracted from the intercorrelations of
random normal deviates can be ro- 
tated to replicate rather closely fac- 
tors obtained from psychological 
data. Placing these two findings to- 
gether permits the deduction that 
hyperplane stuff may serve the func- 
tion of allowing greater opportunity 
to capitalize upon chance by increas- 
ing the ratios of number of factors 
and number of variables to number 
of cases. These ratios were increased 
substantially in this instance. The 
addition of 28 variables from the per- 
sonality questionnaires more than 
tripled the number of variables (from 
13 to 41). The inclusion of these 
variables also was designed to pro- 
vide up to 14 additional factors. 
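
The comparison of Variable 23 with a random normal deviate can be checked with a few lines of arithmetic. The sketch below assumes the common approximation 1/sqrt(N − 1) for the standard error of a zero correlation; the text does not say which formula was used, and the function name is illustrative only.

```python
import math


def standard_error_of_zero_correlation(n_cases: int) -> float:
    """Approximate standard error of r when the population correlation is zero.

    Assumption: the 1 / sqrt(N - 1) approximation; the article does not
    state the formula behind its figure of .060.
    """
    return 1.0 / math.sqrt(n_cases - 1)


N_CASES = 277   # sample size reported in the text
SD_R = 0.073    # SD of Variable 23's correlations with 40 other variables

se = standard_error_of_zero_correlation(N_CASES)
print(round(se, 3))         # 0.06, matching the .060 quoted in the text
print(round(SD_R / se, 2))  # observed spread relative to the chance spread
```

A spread of correlations only slightly wider than the chance value is what one would expect from a variable that is close to pure noise.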


The First-Order Factoring 


Twenty-two factors, based upon 
the intercorrelations of the 41 sup- 
posedly psychological variables plus 
three explicitly defined random nor- 
mal deviates, were retained for rota- 
tional purposes. This number of fac- 
tors was supported by two separate 
tests indicating 20-24 factors. Other 
tests, however, indicate many fewer 
factors. (a) There is a marked drop
in the size of the centroid loadings
between Factors 10 and 11, on the
one hand, and Factors 12 and 13,
on the other. (b) The product of the
largest loadings on the latter two fac-
tors is only about .04. In comparison
the standard error of a zero correla-
tion coefficient is .06. (c) Variable
23 (A 10), which is hardly more than
a random normal deviate, has factor
loadings on all factors that are ap-
preciably larger than those of other
variables with Factor 12 or any fac-
tor thereafter. (d) The first instance
in which parallel forms of the Pri-
mary Mental Abilities tests depart
from each other in sign of factor
loadings is on Factor 11. On Factor
17 and thereafter there is always one
case in which the parallel forms of
the ability variables differ from each
other in sign. On the basis of these
four lines of evidence, 22 factors ap-
pears to be excessive.


The First-Order Rotations 


The evidence for overextraction of 
factors obtained from analysis of the 
unrotated loadings is supported by 
analysis of the rotated factors. There 
are three personality factors (XVIII, 
XX, and XXI), for example, defined 
by the three pairs of alternate forms 
that correlate with each other .01, 
.06, and .08, respectively. There are 
also separate factors for the single 
forms of each subtest of the Culture 
Fair test and for the Fluency test 
from the PMA. The reference vector 
correlations of other variables that 
supposedly convert specific factors 
into nonspecifics simply do not make 
sense. One form of a personality test 
may have a reference vector corre- 
lation greater than .20 with one of 
these specifics, but not its alternate 
form. Some of these correlations are 
relatively large, for example, the ref- 
erence vector correlation of .45 of 
Variable 33 (B 6) on the so-called 
Fluency Factor on which Word Flu- 
ency has a correlation of .39, but it 
still seems reasonable to conclude 
that a specific factor was converted 
into a nonspecific by capitalizing on 
chance. First-order factors were 
needed for these variables for the 
second-order analysis even though no 
provision had been made for them in 
the design of the study and rotations 
were accomplished to achieve this. 

One last line of evidence indicates 
overfactoring and contrived rota-
tions. In showing that factors from
random data could be rotated into
positions that were highly similar
to those from psychological data,
Horn (in press) computed squared
multiple correlations between each
rotated factor and all of the remain-
ing factors. These were quite high,
with most being greater than .90. This
may be the almost inevitable result
of the process of capitalizing on
chance to obtain a theoretical struc-
ture in the data. In the present study
the median squared multiple corre-
lation for the ability factors is .56
with a range from .32 to .71. For the 
personality factors the median is .60 
with a range from .35 to .83. These 
latter particularly are higher than 
anticipated from previous work with 
personality tests and approach the 
values Horn obtained for the random 
data factors. 
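
For readers who wish to compute such values, the squared multiple correlation of each factor with all the others can be obtained from the inverse of the factor intercorrelation matrix as 1 − 1/r^ii, where r^ii is the i-th diagonal element of the inverse. The sketch below uses only the standard library; the 3 × 3 matrix is hypothetical and is not taken from Cattell's or Horn's data.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def squared_multiple_correlations(corr):
    """SMC of each variable with all the others: 1 - 1 / (i-th diagonal of R inverse)."""
    inverse = invert(corr)
    return [1.0 - 1.0 / inverse[i][i] for i in range(len(corr))]


# Hypothetical factor intercorrelations, for illustration only.
FACTOR_CORR = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
smc = squared_multiple_correlations(FACTOR_CORR)
print([round(v, 3) for v in smc])
```

With only two factors the value reduces to the squared correlation between them, which gives a quick check on the routine.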

If we subtract five specific factors 
and three personality factors defined 
by variables essentially uncorrelated 
with each other, we are left with 14 
possibly meaningful factors. This es- 
timate does not, of course, allow for 
the possibility that there may not be 
10 factors remaining in the person- 
ality questionnaire data and is thus 
a maximum figure. 


The Hyperplane Count Criterion 


The preceding analyses of the
first-order rotations indicate that
hyperplane count is an inadequate 
criterion of the validity of rotations. 
This inadequacy is brought out more 
clearly by considering two extreme 
cases in which it might be applied. 
A set of intercorrelations showing 
Spearman's hierarchical order will de-
fine only one factor with no variable
having a loading of zero. This factor 
is uniquely determined though there 
is no hyperplane. In a more complex 
table of intercorrelations, on the other 
hand, if an investigator were to ex- 
tract as many factors as variables, 


he would find n — 1 variables in each 
of the hyperplanes. Furthermore, the 
percentage of variables in the hyper-
plane would increase directly with
n in this second case. Whenever in-
tercorrelations are overfactored, this
second case is approached and the hy- 
perplane count will be high. The in- 
clusion of five specifics and three 
factors determined by zero correla- 
tions between parallel forms insures 
a high hyperplane count overall in 
the Cattell first-order analysis. 
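
The two extreme cases can be made concrete with a small sketch. Both loading matrices below are idealized and hypothetical, and the ±.10 band used to define "in the hyperplane" is an assumed convention, not a value taken from the study under review.

```python
def hyperplane_count(loadings, band=0.10):
    """Per factor, count the variables whose loading lies within +/- band of zero."""
    n_factors = len(loadings[0])
    return [sum(1 for row in loadings if abs(row[k]) <= band)
            for k in range(n_factors)]


def proportion_in_hyperplanes(loadings, band=0.10):
    """Share of all loadings that fall in a hyperplane."""
    counts = hyperplane_count(loadings, band)
    return sum(counts) / (len(loadings) * len(loadings[0]))


def one_factor_per_variable(n):
    """Idealized overfactored solution: n factors, each marked by a single variable."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Case 1: a single general factor on which every variable loads (hypothetical values).
spearman_case = [[0.8], [0.7], [0.6], [0.5]]
print(hyperplane_count(spearman_case))  # [0]: the factor is determinate, yet no hyperplane

# Case 2: as many factors as variables; each hyperplane holds n - 1 variables.
for n in (5, 13, 41):
    print(n, hyperplane_count(one_factor_per_variable(n))[0],
          round(proportion_in_hyperplanes(one_factor_per_variable(n)), 3))
```

The proportion in the second case is (n − 1)/n, so it rises toward 1 as variables and factors are added, regardless of whether the factors mean anything.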

Hyperplane count is also in- 
creased by starting the analysis with 
many variables of low communality. 
Yet Tucker (1964) has shown that 
size of communality is one of the 
most critical variables in determining 
whether a known factor structure can 
be recovered from correlational ma- 
trices. Under carefully controlled con- 
ditions, including adequate commu- 
nalities of the variables and adequate 
criteria for number of factors, hyper- 
plane count is useful. Otherwise it 
provides a rationale for overfactoring, 
use of variables of low communality, 
and subsequent capitalization on 
chance in rotations. 


The Second-Order Factoring 


Eight factors were extracted in the 
second order. Since too many factors 
were rotated in the first order, and 
since the rotations built up a high 
degree of dependency among the 
first-order factors as a whole, it is 
not surprising to find that the second- 
order factors tend to make a larger 
contribution to variance than many 
of the first-order factors. These fac- 
tors accordingly determine high row 
sums of squares (estimated commu- 
nalities) which are in line with the 
high squared multiple correlations 
among the first-order factors. The 
noniterated communalities have a
median of .61 and a range from .32
to .89. Four are in the high eighties. 

Of the eight factors extracted, 
probably only the eighth is too small 
to retain for rotations and interpreta- 
tion. Note that this does not mean 
there are at least seven second-order 
factors in any psychological defini- 
tion of the term. Rather, the high 
level dependencies in the first order 
created second-order factors. By the 
same token, the second-order factors 
in these data make several of the 
first-order factors superfluous; that is, 
for all practical purposes the second- 
order factors completely describe sev- 
eral of the first-order factors. 


The Second-Order Rotations 


There are difficulties in the second- 
order rotations that are ignored in 
the discussion of the results. After ro- 
tation there are two factors that are 
not interpreted. Both have substan- 
tial loadings for a number of varia- 
bles. Both make a bigger contribution 
to variance after rotation than do 
most of the unrotated factors. Such 
factors could have provided freedom 
to rotate the remaining six to agree 
with hypotheses. 

Interpretation of factors and fac- 
tor loadings is of the “broad brush” 
variety. Loadings not in line with 
hypotheses are ignored. For example, 
ego strength, with a loading of .21, 
is listed as one of the defining vari- 
ables for the fluid-ability factor, but 
parmia, also with a loading of .21,
is not mentioned. In interpreting the
crystallized-ability factor, on the
other hand, no mention is made of
cyclothymia or of excitability with
loadings of .52 and −.44, respec-
tively. Similarly, the loading of −.52
of the supposed first-order fluency
factor on the second-order extrover-
sion factor is not discussed. The as-
sertion that there is an interaction


between the control factor and crys- 
tallized ability apparently depends on 
a correlation between the factors of 
.28 whereas the corresponding corre- 
lation with fluid ability is .21. Even 
if factor interpretation were viewed 
as hypothesis formation only, rather 
than hypothesis testing, such selec- 
tivity could not be defended. 


An Alternative Analysis 


Cattell structured his analysis of 
the data as a series of factorings in 
several orders in order to avoid “dis-
tortions from the unreliabilities and
imperfect validities of actual scales”
[1963, p. 13]. This is theoretically
sound, but depends in practice upon 
an objective factor-analytic method- 
ology. It is useful to look at a simpler 
approach and to gauge the extent to 
which unreliability and imperfect 
validity mask relationships versus the 
extent to which a simpler design re- 
duces computations and interpreta- 
tional difficulties. 

The present writer computed inter- 
correlations between sums of the par- 
allel forms of the ability variables 
and the single-form ability tests. The 
result was a nine-variable correla- 
tional matrix which is presented as 
Table 2. One does not need to factor 
this matrix in order to test the hy-
pothesis that the Cattell tests (Vari-
ables 6-9) and the Spatial test will
separate from the remaining Pri-
mary Mental Ability tests. Appropri-
ate median correlations are as
follows: among the PMA without
Spatial Relations, .44; among the
Cattell tests and Spatial Relations,
.305; between the two clusters, .255.
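
These three medians can be reproduced directly from the entries of Table 2. The sketch below is one way to do so; the variable numbering follows Table 2, the helper names are illustrative, and the upper-triangle entry of 18 is used for Variables 5 and 9 (the lower triangle prints 13, a discrepancy that does not change any of the medians).

```python
from itertools import combinations, product
from statistics import median

# Upper triangle of Table 2 (decimals omitted), keyed by the lower-numbered variable.
UPPER = {
    1: [33, 46, 37, 46, 31, 23, 34, 25],
    2: [26, 20, 21, 29, 32, 28, 29],
    3: [45, 37, 28, 27, 38, 24],
    4: [43, 20, 17, 28, 17],
    5: [31, 16, 30, 18],
    6: [33, 45, 29],
    7: [44, 28],
    8: [33],
}


def r(i, j):
    """Correlation (times 100) between variables i and j, numbered as in Table 2."""
    low, high = sorted((i, j))
    return UPPER[low][high - low - 1]


PMA_WITHOUT_SPATIAL = [1, 3, 4, 5]       # Verbal, Reasoning, Numerical, Fluency
CATTELL_PLUS_SPATIAL = [2, 6, 7, 8, 9]   # Spatial plus the four Culture Fair subtests


def median_within(cluster):
    return median(r(i, j) for i, j in combinations(cluster, 2)) / 100


def median_between(first, second):
    return median(r(i, j) for i, j in product(first, second)) / 100


print(median_within(PMA_WITHOUT_SPATIAL))    # 0.44
print(median_within(CATTELL_PLUS_SPATIAL))   # 0.305
print(median_between(PMA_WITHOUT_SPATIAL, CATTELL_PLUS_SPATIAL))  # 0.255
```

Both within-cluster medians exceed the between-cluster median, which is the separation the hypothesis requires.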

TABLE 2
INTERCORRELATIONS OF NINE ABILITY VARIABLES

Variable                1    2    3    4    5    6    7    8    9
1. PMA-Verbal               33   46   37   46   31   23   34   25
2. PMA-Spatial         33        26   20   21   29   32   28   29
3. PMA-Reasoning       46   26        45   37   28   27   38   24
4. PMA-Numerical       37   20   45        43   20   17   28   17
5. PMA-Fluency         46   21   37   43        31   16   30   18
6. Series              31   29   28   20   31        33   45   29
7. Classification      23   32   27   17   16   33        44   28
8. Matrices            34   28   38   28   30   45   44        33
9. Topology            25   29   24   17   13   29   28   33

Note. All decimals omitted.


A factoring is, of course, readily
accomplished. Communalities were
first iterated for each of two, three,
and four factors. Three sets of factor
loadings were then computed for
two, three, and four factors using the
appropriate stabilized communalities.
The roots for these three analyses are
shown in Table 3. Loadings on two,
three, and four factors are shown in
Table 4.

There is always room for argument
among factor analysts about the 
proper number of factors to rotate. 
The present critic is very willing to 
accept two factors, although there is 
obviously a large general factor in 
these data. It would also seem neces- 
sary, if one goes beyond two, to re- 
quire four since there is so little 
difference between the third and 
fourth. The latter factors, in any 
event, do not constitute very impor- 
tant contributions to variance even if 
replicable from the sampling point of 
view. A plot of two factors, which is 
so easy to visualize as not to require
a figure, confirms the separation of 
the Primary Mental Ability tests 
other than the Spatial Test from the 
Culture Fair tests. Furthermore, if 
one uses the same criteria for rota- 
tion as Cattell used in his second- 
order analysis; that is, allowing neg- 
ligible negative loadings of the ability 
tests on ability factors, one obtains 
a correlation between the present 
first-order factors of .44. By allowing 
somewhat larger negative values, a 
better fit is obtained in the present 
nine-variable plot. This alternative 


rotation produces a correlation be- 
tween the factors of .57. These cor- 
relations are to be compared with a 
correlation reported by Cattell of .47. 
One concludes that there is little evi- 
dence for the effects of unreliability 
and imperfect validities that Cattell 
feared. In addition, the hypothesized 
result is reached with less expense, 
less labor, and with much greater con- 
fidence. 
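
The roots of Table 3 and the communalities of Table 4 are tied to the loadings by sums of squares: a variable's communality is the sum of its squared loadings across factors, and a factor's root is the sum of squared loadings down its column. The sketch below checks this for the two-factor solution, using the loadings as printed in Table 4; small discrepancies are expected because the printed loadings are rounded to two decimals.

```python
# Two-factor loadings from Table 4 (Variables 1-9, Factors I and II).
TWO_FACTOR_LOADINGS = [
    (0.64, -0.20), (0.48, 0.15), (0.63, -0.19),
    (0.54, -0.34), (0.57, -0.33), (0.56, 0.20),
    (0.51, 0.35), (0.65, 0.24), (0.44, 0.24),
]
# Communalities printed in Table 4 and roots printed in Table 3 for comparison.
PRINTED_H2 = [0.45, 0.25, 0.43, 0.40, 0.43, 0.35, 0.38, 0.48, 0.25]
PRINTED_ROOTS = [2.83, 0.59]


def communalities(loadings):
    """Row sums of squared loadings."""
    return [sum(a * a for a in row) for row in loadings]


def roots(loadings):
    """Column sums of squared loadings."""
    n_factors = len(loadings[0])
    return [sum(row[k] ** 2 for row in loadings) for k in range(n_factors)]


for variable, (h2, printed) in enumerate(
        zip(communalities(TWO_FACTOR_LOADINGS), PRINTED_H2), start=1):
    print(variable, round(h2, 3), printed)

print([round(v, 2) for v in roots(TWO_FACTOR_LOADINGS)], PRINTED_ROOTS)
```

The first root is several times the size of the second, which is the large general factor referred to above.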

TABLE 3
ROOTS OBTAINED AFTER ITERATING COMMUNALITIES FOR TWO, THREE, AND FOUR
FACTORS

                   Number of factors in iterations
Factor number         2         3         4
I                   2.83      2.87      2.90
II                   .59       .62       .65
III                  .13       .22       .22
IV                   .12       .13       .19
V                    .02       .04       .06
VI                  −.00       .02       .08
VII                           −.08       .00
VIII                                    −.04


TABLE 4
TWO-, THREE-, AND FOUR-FACTOR SOLUTIONS FOR THE ABILITY VARIABLES

                Two                  Three                        Four
Variable     I    II    h²      I    II   III    h²      I    II   III    IV    h²
1           64   −20    45     65   −22   −14    49     64   −20   −13    10    48
2           48    15    25     50    17   −32    38     49    17   −29    14    37
3           63   −19    43     63   −19    02    43     64   −19   −08   −24    51
4           54   −34    40     54   −34    05    41     54   −33    00   −15    42
5           57   −33    43     57   −34    06    44     59   −38    15    21    55
6           56    20    35     55    19    06    35     56    21    15    16    41
7           51    35    38     50    33    01    36     50    34   −01   −06    37
8           65    24    48     67    27    27    60     66    27    24   −11    58
9           44    24    25     44    23   −11    26     44    24   −12   −01    26

Note. All decimals omitted.


The congruence of observations
with hypothesis does not rule out
other theoretical possibilities. Vernon
(1950) would have predicted the
same result on the basis of his in-
terpretations of hierarchical factors.
On the basis of these data, no matter
how analyzed, the two factors might
as well be called intellectual-educa-
tional, rather than crystallized, and
spatial-practical, rather than fluid.
The correlation between the two fac-
tors determines a higher-order factor
also, which Vernon would call general
intelligence. Cattell’s discussion of
the higher-order factor, on the other
hand, is ambiguous. He calls both
fluid and crystallized abilities general
abilities, when they are obviously not
general by any accepted definition of
the term. The only general-ability
factor by the usual definition is the
second-order factor in the writer’s
analysis and the third-order factor in
Cattell's analysis. The general factor
is determined in both analyses by the
correlation between the fluid- and
crystallized-ability factors.

More recent work by Horn and
Cattell (1966), appearing since the
above was written, is based upon a
simpler design; the factors with
which the authors are concerned are
in the first rather than in the second
order. Since factor score estimates
were intercorrelated, however, these
first-order factors are the equivalent
of the usual second-order factors.

There is still one debatable meth-
odological issue; the present writer
would have rotated and interpreted
a smaller number of factors. As a
consequence of their choice of number
of factors, they describe four general-
ability factors, or in the more usual
terms four broad group or second-
order ability factors, rather than two.
These are Gv (visualization) and Gs
(speed) in addition to the earlier
fluid and crystallized abilities.

The separation in the second order 
of the several first-order visualization 
factors from the fluid-ability factors 
is unexpected from Vernon’s (1950) 
position, but the results must be con- 
sidered highly tentative. The authors 
depended upon a commonly misused 
criterion for the number of factors 
that Horn had previously (1965) and 
cogently criticized and for which he 
had recommended a correction. Even 
cursory inspection reveals the pres- 
ence of surplus factors. The variables 
with the highest estimated commu- 
nalities are among those with the 
lowest levels of correlations with the 
remaining variables. Two rotated fac- 
tors are specifics. A third factor is 
defined by four variables whose in- 
tercorrelations have a mean absolute 
value of .135. 

Reanalysis of the Horn-Cattell
data, omitting “noisy” variables 24
and 31 whose correlations with other
variables are little more than ran-
dom, and iterating communalities for
four, five, and six factors, indicates 
that four factors should be retained 
for rotational purposes. One of these 
is an abstract and visual reasoning 
factor which combines the Horn- 
Cattell fluid and visualization abili- 
ties. A second is generally comparable 
to their crystallized-ability factor. A 
third is most nearly comparable to 
their general speed or test motivation 
factor, but with the fluency variables 
added; intellectual speed might be a 
suitable name. The fourth factor in 
this solution is a personality factor 
similar to the Horn-Cattell PSI. 

If a fifth factor is retained for ro- 
tations, fluid and visualization factors 
very similar to the ones described by 
Horn and Cattell emerge. There is no 
suggestion, as in the earlier study, 
that rotations were determined by the 
author’s expectations. If five factors 
can be supported, then the separation 
of second-order visualization and 
fluid abilities can be supported. In 
addition to the fact that other criteria 
indicated the tenuous nature of the 
fifth factor, however, the estimated 
communality of Variable 29 climbs 
rapidly in 50 iterations to the un- 
reasonably high figure of .93 and is 
still changing by .01. Communality 
estimates that are psychologically 
highly improbable, even though they 
do not exceed unity, suggest that 
an error has been made in selecting 
the number of factors. In contrast, 
estimated communalities stabilize 
quite quickly and reasonably in a 
four-factor solution. 

One matter is somewhat clarified in 
the recent article (Horn & Cattell, 
1966). The mechanical information 
factor appears to be more closely re- 
lated to verbal-intellectual factors 
than to the spatial-practical as 
Vernon (1950) would expect. This is 
seen both in the published intercor- 
relations and in the writer's four- 
factor solution. Both of these bases 
for inference, however, show that the 
differences are not large. The writer's 
hunch is that neither the Vernon 
hierarchy nor the Cattell theory will 
accurately describe the relationships 
of the mechanical information factor 
to other factors when more depend- 
able data become available. 

COMPARABILITY OF GROUP TELEVISION AND 
INDIVIDUAL ADMINISTRATION OF THE PEABODY 
PICTURE VOCABULARY TEST: 
IMPLICATIONS FOR SCREENING 


GEORGE A. FARGO, DORIS C. CROWELL, MARY H. NOYES, ROBERT Y. 
FUCHIGAMI, JOHN M. GORDON, AND PETER DUNN-RANKIN 
University of Hawaii 


In a pilot study designed to test the efficiency of administering the 
Peabody Picture Vocabulary Test (PPVT) by means of educational 
television, 126 3rd- to 5th-grade children at the University of Hawaii 
Elementary School participated in a counterbalanced design study. All 
children were administered the PPVT individually and in a TV group 
session with ½ Ss having the individual test first and ½ the group 
test first; Forms A and B were similarly alternated for order and type 
of administration. An analysis of variance of the raw scores indicates 
the test results for individual and group TV administration are com- 
parable. Further studies using a broader sampling group would dem- 
onstrate the feasibility of the use of educational TV as a screening 
medium. 


This study was conducted to ex- 
amine the feasibility of adapting the 
Peabody Picture Vocabulary Test 
(PPVT, Dunn, 1959) for group ad- 
ministration by means of educational 
television. The objective was to test 
the hypothesis that scores obtained 
in group TV administration would not 
differ significantly from those obtained 
in individual administration. The in- 
vestigators believed that if the two 
administrations were found to be com- 
parable, the economical group admin- 
istration could be used to screen chil- 
dren and identify those who need 
further individual study. 

The Peabody Picture Vocabulary 
Test was designed to measure verbal 
intelligence through measuring recep- 
tive vocabulary. The test provides 
standard score equivalents from age 
2 years 3 months to 18 years 5 months 
for mental age, intelligence quotient, 
and percentile. The two forms A and B 
were standardized on over 4000 sub- 
jects (Ss) and have been adequately 
demonstrated to be equivalent (Dunn, 
1965). 


RELATED STUDIES 


Norris, Hottel, and Brooks (1960) 
explored the group administration of 
the PPVT utilizing photographic slides 
projected onto a screen with an ex- 
aminer providing the verbal cues. 
Their study demonstrated that the 
average scores of fifth-grade children 
of normal intelligence and in proper 
school grade for their age were not a 
function of the form or method of ad- 
ministration, that is, individual or 
group. Tempero and Ivanoff (1960) 
administered both forms of the PPVT 
to 150 seventh-grade students using an 
opaque projector and obtained a corre- 
lation of .75, which demonstrated the 
equivalence of IQ scores on Forms A 
and B. In addition, Dunn (1965, Ta- 
ble 1) cited nine other studies which 
support the equivalence of the two 
forms. 

Although Norris and Tempero ad- 
ministered the PPVT to groups, they 
did not utilize TV administration. A 
similar procedure was employed by 
Curtis, King, and Kropp (1963) as 
part of a television testing study. Sup- 
ported by the statements of Cronbach 
(1960) and Cardell (1962) indicating 
the value of and need for group 
screening, the investigators explored 
the possibility of television as a 
screening device. 


METHOD 


Experimental Design 


A counterbalanced Treatment X Ss de- 
sign (Lindquist, 1953) was utilized. Half the 
Ss had the individual presentation prior to 
the group video tape presentation and half 
had the group session first. There was no sig- 
nificant difference in the mean IQ's of the 
groups as measured by other individual in- 
telligence tests. Parallel PPVT Forms A and 
B were used in both administrations. Table 1 
presents the plan of administration. 

The Treatment X S design increased the 
precision of the experiment by eliminating 
inter-S differences as an error factor. In ad- 
dition, the design enabled the investigators 
to identify the effects of learning on the re- 
test. Since the form and order of administra- 
tion (group or individual) were rotated for 
all Ss, any systematic increase in test scores 
obtained during the second presentation 
could be attributed to learning. 
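
The counterbalanced plan described above can be written out explicitly. The following Python sketch is illustrative only: the group numbering follows Table 1, while the names `PLAN` and `is_counterbalanced` are hypothetical labels introduced here and are not part of the original study.

```python
from collections import Counter

# (mode, form) given to each group in the first and second sessions,
# following the layout of Table 1.
PLAN = {
    1: [("individual", "A"), ("tv", "B")],
    2: [("individual", "B"), ("tv", "A")],
    3: [("tv", "A"), ("individual", "B")],
    4: [("tv", "B"), ("individual", "A")],
}


def is_counterbalanced(plan):
    """Return True if every group sees both modes and both forms, and each
    (mode, form) combination occurs exactly once within each session."""
    for first, second in plan.values():
        if first[0] == second[0] or first[1] == second[1]:
            return False
    for session in (0, 1):
        cells = Counter(sessions[session] for sessions in plan.values())
        if len(cells) != 4 or set(cells.values()) != {1}:
            return False
    return True


balanced = is_counterbalanced(PLAN)
```

Under this layout, any systematic gain on the second session is not confounded with form or with mode of administration, which is the property the design relies on.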


Subjects 


One hundred thirty-five children were 
selected from the University of Hawaii Ele- 
mentary School. All third, fourth, and fifth 
grades participated in the study. Subse- 
quently, the number of Ss was reduced to 
126; nine Ss were eliminated as they were 
unavailable for the total testing sequence. 


TABLE 1 

DESIGN FOR TEST ADMINISTRATION 

Administration    Individual                   TV 
First 
  Form A          Group 1 (N = [illegible])    Group 3 (N = 30) 
  Form B          Group 2 (N = 34)             Group 4 (N = 38) 
Second 
  Form A          Group 4 (N = 33)             Group 2 (N = 34) 
  Form B          Group 3 (N = 30)             Group 1 (N = [illegible]) 


Individually administered standardized 
intelligence tests placed these children within 
an IQ range of 91-152 (with a mean of 123). 
It is apparent that this was a highly selected 
group of children. Although all ethnic back- 
grounds found in the state of Hawaii were 
represented, every child was English speak- 
ing. 


Procedure 


Two video tapes, one each for Forms A 
and B were made with modifications for use 
with IBM answer sheets (IBM form I.T.S, 
1000A 309). An experienced psychological 
examiner administered the PPVT on both 
tapes. Two testing rooms were set up for 
simultaneous viewing of the TV tapes. Each 
room contained two television monitors (18 
viewing positions to each monitor). 

Prior to the TV presentation the teacher 
announced briefly that the students were 
going to take a test on TV and she asked 
only that the students write their names on 
the answer sheets. The original directions 
from the manual were adapted for group TV 
administration. The adaptation included an 
orientation to the task and an orientation to 
the use of the answer sheet. The total elapsed 
tape time for introduction, instructional pro- 
cedures, and the test items was 30 minutes. 

Plates 40 through 120 were included. 
These limits were defined by the tenth per- 
centile for the third graders (or equivalent 
age) to the 90th percentile for the fifth 
graders (or equivalent age) as estimated 
from the norms cited in the manual. After 
the plate number, each word was presented 
twice. Items 40 through 59 were paced at 10- 
second intervals, 60-89 at 12-second inter- 
vals, and 90-120 at 15-second intervals. At 
specified intervals throughout the tape, the 
examiner reminded the children to look at 
all four pictures and pointed out the in- 
creasing difficulty of the items, encouraging 
them to respond as best they could. The 
Ss marked their responses on the IBM an- 
swer sheets. 

Individual presentations were adminis- 
tered to each S as previously described fol- 
lowing the PPVT standardized procedure 
(Dunn, 1965). 


RESULTS AND CONCLUSION 


An analysis of variance of the scores 
obtained on the 126 individual and 
group TV presentations yielded a Be- 
tween Administration F ratio of .75. 
Since the F ratio is less than the criti- 
cal value, the variation in the data 
may be attributed to chance and the 
null hypothesis was not rejected. 

Three t tests were applied to the 
scores comparing performance on In- 
dividual Administration versus Group 
Administration, scores on Form A 
versus Form B, and scores on First 
Administration versus Second Ad- 
ministration. In no case did the dif- 
ferences in means reach significance 
(Table 2). These findings confirm 
those of Norris et al. that on retest a 
one-point gain may be attributed to a 
practice effect. The F test showed no 
significant difference at the .01 level 
in variance when scores were com- 
pared by Form or Order of Admin- 
istration but differences slightly ex- 
ceeded significance when Individual 
and Group Administration scores 
were compared. 
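
The variance comparisons reported above can be re-derived from the standard deviations printed in Table 2. The sketch below is illustrative Python, not the authors' computation; the names `TABLE_2_SDS` and `variance_ratio` are introduced here, and placing the larger variance in the numerator is an assumption.

```python
# Standard deviations of raw scores as printed in Table 2,
# one pair per comparison (Form, Administration, Order).
TABLE_2_SDS = {
    "form": (8.97, 10.89),
    "administration": (8.77, 11.04),
    "order": (9.91, 9.95),
}


def variance_ratio(sd_a, sd_b):
    """Return the larger variance divided by the smaller variance."""
    larger, smaller = max(sd_a, sd_b), min(sd_a, sd_b)
    return (larger / smaller) ** 2


ratios = {name: round(variance_ratio(a, b), 2)
          for name, (a, b) in TABLE_2_SDS.items()}
```

Because the printed standard deviations are rounded to two decimals, the recomputed ratios agree with the tabled F values only to within about .01.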


Discussion 


Although Ss used represent a rather 
select group in terms of intellectual 
ability, they function over a wide 
enough range to support the appropri- 
ateness of TV as a medium for group 
screening. The apparent comparabil- 
ity in scores obtained under the two 
types of test administration points to 
the feasibility of the use of the TV ad- 
ministration of the PPVT for group 
testing. The greater variability in the 
group administration can probably 
be minimized by more careful moni- 
toring of the testing situation and 
improving the viewing conditions with 
spaced desk placement and optimal 
lighting for the TV screen. 

TABLE 2 

PPVT PERFORMANCE OF 126 THIRD-, FOURTH-, AND FIFTH-GRADE STUDENTS 
(RAW SCORES) 

           Form               Administration        Order 
         A        B         Individual    TV       1st      2nd 
M      82.91    82.67         83.11     82.47     82.53    83.06 
SD      8.97    10.89          8.77     11.04      9.91     9.95 
tᵃ         0.23                   .060                 .044 
Fᵇ         1.47*                  1.59**               1.009 

ᵃ N = 126. 
ᵇ df = 125, 125. 
* p < .05. 
** p < .01. 


There are several important impli- 
cations in group TV test administra- 
tion: 

1. TV screening is more economical 
in time and personnel. The cost in time 
of testing 100 children on the PPVT 
individually would be 25 examiner 
hours. Using TV administration the 
same task can be accomplished with 
2½ man-hours. 

2. The class is not repeatedly inter- 
rupted by the removal of each child 
and the test may be taken in the class- 
room. 

3. The most proficient and effective 
test administrator with the best enun- 
ciation can be selected. 

4. A taped presentation provides 
consistency and high reliability be- 
tween repeated administrations. 

5. The screening test can be pre- 
sented simultaneously throughout the 
school district to obtain longitudinal 
results that may be used for inter- 
school comparisons. 

6. The use of IBM answer sheets 
adds speed and accuracy in scoring, in 
addition to reducing the per-pupil cost 
of testing and scoring. 

Educational television has been 
used successfully for teaching and 
demonstration purposes. The present 
pilot study demonstrates its use can 
be extended to evaluation as a group- 
screening medium. An appropriate 
next step would be a replication of 
this study with a broader population 
sample. 


DIFFERENTIAL PERSONALITY DEVELOPMENT IN 
YOUNG ADULTS OF MARKEDLY DIFFERENT 
APTITUDE LEVELS¹ 

No. 3, 141-152 


WALTER T. PLANT AND EDWARD W. MINIUM 
San Jose State College 


Differential personality changes for high- and low-aptitude groups were 
studied. Personality test and retest data were utilized from 5 different 
longitudinal studies; 3 studies were over a 2-yr. time period and 2 were 
over 4 yrs. Personality changes over 2 and 4 yrs. were studied for young 
adult males and females separately. 40 analyses of covariance were re- 
ported with retest means adjusted for initial test score. There was sub- 
stantial evidence for: (a) young adults of higher aptitude to exhibit 
more personality change over time and in the direction of the trend of 
college students in general, and (b) young adults of higher aptitude to 
exhibit more “psychologically positive” personality development over 
time. 


Contrary to the widely read and 
quoted conclusions of Jacob (1957), 
all reported studies of personality 
change in college students during col- 
lege indicate that measurable changes 
occur. From the earliest studies 
(Corey, 1936, 1940; Kuhlen, 1941; 
Newcomb, 1943) to the last published 
study (Lehmann, Sinha, & Hartnett, 
1966), and reviews on the topic 
(Bloom & Webster, 1960; McCullers 
& Plant, 1964; Webster, Freedman, & 
Heist, 1962), changes in attitude, in- 
terest, personality, and value test 
scores have been reported for college 
students during their college years. 
With respect to certain of the varia- 
bles with which the present study is 
concerned, it has been consistently re- 
ported from institutions as different 
as Michigan State University (Leh- 
mann, 1963; Lehmann & Dressel, 
1962, 1963; Lehmann et al., 1966), 
San Jose State College (Plant, 1958a, 
1958b, 1962, 1965), University of 

¹ Research conducted under Contract S- 

; Cooperative Research Branch, United 
States Office of Education, Department of 
Health, Education and Welfare. Appreci- 
ation is herein expressed to Robert B. Clarke 
and Allan Pratt for statistical consultation 
and computer analyses respectively. 


Santa Clara (Foster, Stanek, & Kras- 
sowski, 1961), Vassar College (Freed- 
man, 1960; Webster, 1958; Webster 
et al., 1962), and six California junior 
colleges (Telford & Plant, 1963) that: 

1. Freshman students have signifi- 
cantly higher Authoritarianism, Dog- 
matism, or Ethnocentrism scale means 
than do upperclass or senior students; 
and 

2. Freshman students have signifi- 
cantly higher Authoritarianism, Dog- 
matism, or Ethnocentrism scale means 
than they do when retested at the end 
of their sophomore, junior, or senior 
years. 

The interpretation of these reported 
results has generally been that the col- 
lege experience has been a “liberaliz- 
ing” one in terms of personality devel- 
opment. Either implicitly or explicitly, 
the reported personality changes have 
been attributed to the college impact 
upon student personality. 

There have been only four studies 
published to date on personality 
changes associated with college at- 
tendance in which comparison groups 
have been included. In these studies, 
groups of subjects (Ss) were studied 
over a 2-year period (Plant, 1958b, 
1962, 1965; Telford & Plant, 1963) 
and over a 4-year period (Lehmann 
& Dressel, 1963; Plant, 1962, 1965). 
Comparison was made between Ss in 
college attendance throughout the 
study period and Ss with no college 
experience beyond matriculation or 
some amount less than the study pe- 
riod. In the several studies changes 
were found to be similar for both types 
of Ss. These 2- and 4-year results have 
called the whole literature dealing with 
the effects of a collegiate experience 
upon student personality into question. 

It may very well be that what in- 
vestigators have been reporting as 
changes in personality characteristics 
resulting from college attendance are 
developmental changes in personality 
characteristics for bright young adults 
irrespective of their higher educational 
attainment during a given period of 
time. Longitudinal investigations of 
the subsequent intellectual develop- 
ment of initially brighter-than-aver- 
age Ss indicate that changes continue 
well into adulthood (Bayley, 1955, 
1957; Bayley & Oden, 1955; Owens, 
1953) and to an extent not character- 
istic of less talented Ss. The current 
study attempted to determine if non- 
intellectual characteristics change over 
time for brighter-than-average Ss in 
the same fashion. 

The pool from which the study 
groups were obtained was presumed 
to be comprised of brighter-than-aver- 
age Ss. All Ss applied for admission to 
a college, were admitted, and the ma- 
jority attended college. Evidence rele- 
vant to support of the presumption 
has been reviewed by Cronbach 
(1960), and reported in local studies 
(Conry & Plant, 1965; Plant & Lynd, 
1959; Plant & Richardson, 1958). 


METHOD 


All data needed for the current studies 
were available from three separate longi- 
tudinal research projects dealing with “col- 
lege impact,” each of which incorporated 
comparison groups as a part of the research 
design. There were five separate studies un- 
dertaken: three with a 2-year time interval 
between testing and retesting with the non- 
intellectual measures, and two with a 4-year 
interval. A description of the samples, in- 
struments used, and statistical analyses per- 
formed is found below for each of the five 
studies. Because statistically significant sex 
differences were found for the study 
variables, all analyses were made separately 
for males and females. For purposes of the 
present study, no differentiation was made 
between Ss who were enrolled for the en- 
tire period under study and those who were 
in attendance for a lesser time. 


Original Samples and Instruments 
Used 


Brief descriptions of the total samples 
from which the current study samples were 
obtained are presented below. Abbreviations 
for the several scales which were used follow 
the full names of the tests, and will be used 
subsequently in reporting the results. 

Study 1. Two-year interval between test- 
ing and retesting. Original source of data: 
Plant (1958b). 

A group of 755 Ss (356 males and 399 
females) was tested in 1953 with a 32-item 
California Ethnocentrism Scale (Adorno, 
Frenkel-Brunswik, Levinson, & Sanford, 
1950), or E Scale, and the 1949 form of the 
American Council on Education Psycho- 
logical Examination for college freshmen or 
ACE as a part of the college prematricula- 
tion testing program at San Jose State Col- 
lege. The Ss were retested in 1955 with the 
E Scale. 

Study 2. Two-year interval between test- 
ing and retesting. Original source of data: 
Plant (1962, 1965). 

A group of 1,440 Ss (600 males and 840 
females) was tested in 1958 with a 30-item 
California Ethnocentrism Scale (Adorno 
et al., 1950) or E Scale, the Dogmatism Scale 
(Rokeach, 1956) or D Scale, the Gough re- 
vision of the Authoritarianism Scale (Gough, 
1951) or F Scale, and the 1949 form of the 
ACE as a part of the college prematricula- 
tion testing program at San Jose State Col- 
lege. The Ss were retested in 1960 with the 
E, D, and F Scales. 

Study 3. Two-year interval between test- 
ing and retesting. Original source of data: 
Telford and Plant (1963). 

A group of 1,678 Ss (926 males and 752 
females) was tested in 1960 with the Dog- 
matism Scale (Rokeach, 1956) or D Scale, 
Allport-Vernon-Lindzey Study of Values 
(Allport, Vernon, & Lindzey, 1960) or AVL, 
the Sociability: Sy, Self-Control: Sc, 
Achievement via Independence: Ai, Intel- 
lectual Efficiency: Ie, and Responsibility: 
Re scales from the California Psychological 
Inventory or CPI (Gough, 1957), and Form 
A of the School and College Ability Test or 
SCAT as a part of the prematriculation test- 
ing program in six public junior colleges in 
California: Coalinga, Contra Costa, Foot- 
hill, Hartnell, Monterey Peninsula, and 
San Jose City colleges. The Ss were retested 
with the D Scale, the AVL, and the five 
scales from the CPI in 1962. 

Study 4. Four-year interval between test- 
ing and retesting. Original source of data: 
Plant (1958a). 

A group of 271 Ss (137 males and 134 
females) was tested in 1953 with a 32-item 
Ethnocentrism Scale (Adorno et al., 1950) or 
E Scale, and the 1949 form of the ACE as a 
part of the college prematriculation testing 
program at San Jose State College. The Ss 
were retested in 1957 with the E Scale after 
they had finished college. This was a special 
study in which no attempt was made to re- 
test original Ss who failed, for any reason, 
to complete the B.A. degree in the 4-year pe- 
riod. 

Study 5. Four-year interval between test- 
ing and retesting. Original source of data: 
Plant (1962, 1965). 

A group of 1,032 Ss (448 males and 584 
females) was tested in 1958 with a 30-item 
California Ethnocentrism Scale (Adorno 
et al., 1950) or E Scale, the Dogmatism Scale 
(Rokeach, 1956) or D Scale, the Gough re- 
vision of the Authoritarianism Scale (Gough, 
1951) or F Scale, and the 1949 form of the 
ACE as a part of the college prematricula- 
tion testing program at San Jose State Col- 
lege. The Ss were retested in 1962 with the 
E, D, and F Scales. 

It was felt that five separate studies which 
differed somewhat in the variables utilized, 
in the types of Ss studied, and in the time 
periods studied would enhance the general- 
ity of findings regarding change in person- 
ality as related to ability. 


Statistical Analyses Performed 


Within each subgroup of males and of 
females in each of the five studies, those Ss 
were selected whose ACE or SCAT total 
raw scores were in the upper quarter or in 
the lower quarter of the distribution for the 
intact subgroup. The resulting subgroups 
will henceforth be referred to as “high-apti- 
tude” and “low-aptitude” groups. These 
groups, so constituted, became the object of 
the present analyses. 
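
The selection rule above (upper and lower quarters of the aptitude distribution within each sex-by-study subgroup) might be sketched as follows. This is a hypothetical Python illustration: the function name, the subject identifiers, and the handling of scores that fall exactly on a quartile boundary are assumptions, since the text does not say how ties were resolved.

```python
from statistics import quantiles


def split_by_aptitude(scores):
    """Split one intact subgroup into high- and low-aptitude groups.

    scores maps a subject identifier to an aptitude total raw score
    (ACE or SCAT). Subjects at or above the third quartile are called
    "high"; those at or below the first quartile are called "low".
    """
    q1, _, q3 = quantiles(scores.values(), n=4, method="inclusive")
    high = sorted(s for s, v in scores.items() if v >= q3)
    low = sorted(s for s, v in scores.items() if v <= q1)
    return high, low


# Invented scores for eight hypothetical subjects.
demo_scores = {f"s{i}": 10 * i for i in range(1, 9)}
high_group, low_group = split_by_aptitude(demo_scores)
```

The middle half of each subgroup is simply left out of the later comparisons, which is why each aptitude group is roughly one-quarter of its parent sample.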

To examine the possibility of differential 
changes in nonintellectual scores over the 
2- and 4-year periods as a function of level 
of aptitude, analyses of covariance were per- 
formed. Initial scores on each nonintellectual 
variable were used to adjust final scores; 
the tests of significance were therefore made 
on the adjusted retest means between the 
high- and low-aptitude groups. Summed 
across the five studies one finds 20 retest 
variables, and since the analyses were per- 
formed separately for male and female 
groups, 40 such tests resulted. 


TABLE 1 

SAMPLE SIZE AND APTITUDE SCORES FOR ORIGINAL GROUPS FROM WHICH 
SUBJECTS WERE SELECTED FOR STUDY 

                          Study number 
Subjects      1        2        3        4        5 
             ACE      ACE      SCAT     ACE      ACE 
             total    total    total    total    total 
Males 
  N          356      600      926      137      448 
  M          103.2    108.5     63.7    106.8    109.3 
  SD          23.0     21.1     17.0     21.6     21.5 
Females 
  N          399      840      752      134      584 
  M          100.2    106.4     58.2    103.3    107.0 
  SD          23.3     21.1     17.4     18.4     21.1 
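
The adjustment described above follows the usual one-covariate analysis of covariance: each group's retest mean is corrected for how far that group's initial mean lies from the grand initial mean, using the pooled within-group regression slope. A minimal Python sketch, with invented data and hypothetical function names, is given here for orientation; it does not reproduce the authors' computer analyses and omits the F test itself.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of retest score (y) on initial score (x).

    groups maps a group label to a list of (x, y) pairs.
    """
    sxy = sxx = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx


def adjusted_means(groups):
    """Retest means adjusted to the grand mean of the initial scores."""
    slope = pooled_within_slope(groups)
    grand_mx = mean(x for pairs in groups.values() for x, _ in pairs)
    return {
        label: mean(y for _, y in pairs)
        - slope * (mean(x for x, _ in pairs) - grand_mx)
        for label, pairs in groups.items()
    }


# Invented scores: both groups share a slope of 0.5 and the "high"
# group lies 2 points above the "low" group at any given initial score.
example = {
    "low": [(1, 10.5), (2, 11.0), (3, 11.5)],
    "high": [(3, 13.5), (4, 14.0), (5, 14.5)],
}
adjusted = adjusted_means(example)
```

In the invented data the raw retest means differ by 3 points, but the adjusted means differ by 2, the part of the difference not accounted for by the groups' different starting levels.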


High- versus Low-Aptitude Groups 
Studied 


The number of cases in the original sam- 
ples from which Ss were selected, and data 
relative to academic aptitude scores of these 
groups, are presented in Table 1. In the 
high- and low-aptitude groups as constituted 
for the present study, the number of cases in 
each approximated one-quarter of the size of 
each group reported in Table 1. Occasionally 
the number is less since in the subsequent 
analyses it was discovered that for some 
variables in some studies, no retest score 
was available on one or more of the nonin- 
tellectual measures. In approximately half of 
the groups, no scores were missing. In the 
remainder, from one to 15 scores were miss- 
ing, and these Ss were necessarily eliminated 
from the present study. 


RESULTS 
Preliminary Considerations 


The data describing performance on 
the nonintellectual variables for the 
high- and low-aptitude subgroups are 
presented in Tables 2, 3, and 4. Con- 
sideration of the entries in the columns 
Mx (mean of initial test score) and 
My (mean of final test score) yields 
several observations relative to pro- 
cedural matters. First, there were ap- 
parent differences between means on 
many variables for males and females. 
Therefore, the separate treatment of 
the two sexes appears sound. Second, 
there were apparent differences be- 
tween initial means on many variables 
for high- and low-aptitude groups. 
This is as expected in view of the many 
reports of modest but significant cor- 
relations between aptitude and non- 
intellectual variables among Ss of the 
type employed, and makes the decision 
to use covariance procedures in the 
study appear sound. Third, it is ap- 
parent that there often were substan- 
tial differences between initial level 
of performance and that of 2 or 4 
years later. Whereas these considera- 
tions are pertinent to the design of the 
study, they are not the object of the 
present investigation. Consequently, 
no statistical evidence for these con- 
siderations is offered here. The inter- 
ested reader is referred to earlier stud- 
ies cited where these dimensions have 
been of direct interest. 


TABLE 2 

COMPARISONS OF PERSONALITY TEST SCORE CHANGES OVER TWO YEARS FOR 
HIGH- AND LOW-APTITUDE MALES: STUDIES 1, 2, AND 3 

Test variable   Aptitude   Mxᵃ        Myᵇ        SDy        My adjᶜ    Δcovᵈ      Fyᵉ        N 
Study 1 
  E Scale       High        85.5       77.4      22.2        79.9      −          1.61       89 
                Low         93.9       86.2      26.0        83.6                            89 
Study 2 
  D Scale       High       153.0      143.3      24.7       147.5      −           .14       150 
                Low        166.8      152.7      26.1       148.5                            150 
  E Scale       High        80.6       73.5      22.0        78.1      −          3.45       150 
                Low         93.9       86.7      22.8        82.1                            149 
  F Scale       High       117.3      101.5      21.7       105.9      −          9.73**     150 
                Low        131.9      117.3      21.6       112.9                            149 
Study 3 
  CPI: Sy       High        24.4       25.0       5.2        24.5      +          [illeg.]   231 
                Low         22.7       23.9       5.0        24.4                            229 
  CPI: Sc       High       [illeg.]    26.8       7.6        26.8      −           .49       231 
                Low         25.7       27.2       8.2        27.2                            229 
  CPI: Ai       High        19.6       21.6       4.2        20.1      +         20.09**     231 
                Low         14.7       16.8       4.8        18.3                            229 
  CPI: Ie       High        38.1       39.5       5.3        37.9      +          9.90**     231 
                Low         31.3       34.7       5.8        36.3                            229 
  CPI: Re       High        30.3       31.3       4.7        30.1      +          5.05*      231 
                Low         26.6       28.2       4.8        29.3                            229 
  AVL: Theor    High        46.1       46.5       7.4        46.0      +         26.33**     231 
                Low         43.8       42.5       6.1        43.0                            229 
  AVL: Econ     High        40.3       39.8       8.7        40.2      −         11.37**     228 
                Low         42.3       43.1       7.4        42.5                            216 
  AVL: Aesth    High        36.6       38.9      10.1        38.3      +         13.18**     228 
                Low         34.4       35.0       7.2        35.7                            216 
  AVL: Soc      High        34.6       35.5       7.1        35.9      −          [illeg.]   [illeg.] 
                Low         36.2       37.5       6.2        37.1                            216 
  AVL: Polit    High        42.8      [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.] 
                Low         43.0       43.2       6.2        43.1                            216 
  AVL: Relig    High        38.3       36.6      [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.] 
                Low         39.0       39.1       8.5        38.9                            216 
  D Scale       High       152.8      143.8      27.8       147.8      −         19.89**     228 
                Low        172.6      162.7      27.8       158.7                            224 

ᵃ Initial mean test score. 
ᵇ Retest mean score. 
ᶜ Retest mean adjusted by covariance procedures for initial test score. 
ᵈ Direction of change in covariance analysis; + = high-aptitude group had higher My 
adj., and − = high-aptitude group had lower My adj. compared with that of low-aptitude 
group. 
ᵉ F ratio obtained in testing for homogeneity of adjusted means. 
* p = .05. 
** p = .01. 


Comparison of the Two Aptitude 
Groups 

Final means, when adjusted by 
analysis of covariance procedures for 
initial performance, appear in Tables 
2, 3, and 4 under the column headed 
Myaaj. For convenience, the adjacent 
column, headed Acv, shows the rela- 
tionship between the level of adjusted 
means and level of aptitude; a + sign 
indicates that the high-aptitude group 
had the higher adjusted mean. The sta- 
tistical tests for the covariance analy- 
sis appear in the columns headed Fy, . 
These give the F ratios appropriate to 
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TABLE 3 


Comparisons OF PERSONALITY Test SCORE CHANGES OVER Two YEARS FOR 
Hieu- AND Low-APrrTUDE FEMALES: STUDIES 1, 2, AND 3 


Test variable Ae Mx My SDy | My adj | A cov Pu N 
Study 1 High | 76.7 | 69.0 | 22.7 | 75.5 100 
E Scale - 1.24 
Low 99.3 | 85.5 | 26.1 | 79.0 100 
Study 2 High | 147.5 | 136.2 | 23.9 | 140.2 210 
D Scale - 45 
Low | 161.5 | 145.8 | 28.3 | 141.7 210 
High | 69.3 | 62.7 | 17.4 | 68.9 210 
E Scale — | 712" 
Low 90.8 | 79.6 | 22.7 | 73.4 210 
High | 111.2 | 97.9 | 21.7 | 103.6 210 
F Scale - 2.21 
Low | 128.5 | 112.3 | 27.7 | 106.6 210 
Study 3 High | 25.1 | 25.3 | 5.4 | 242 188 
CPI: Sy Hu .29 
Low 21.9 | 22.8 | 5.6 | 23.9 185 
High | 27.5 | 284 | 7.5 | 28.5 188 
CPI: Se 2:80 
Low 28.1 | 29.6| 8.3 | 29.5 185 
High | 20.5 | 22.2 | 3.5 | 20.9 188 
CPI: Ai + | 29.08** 
Low 15.8 | 17.6 | 3.8 | 19.0 185 
High 38.6 | 40.6 | 4.9 | 39.0 188 
CPI: Ie + | 21.08** 
Low 32.7 | 35.0 | 5.9 | 36.6 185 
High | 32.5 | 33.0 | 4.0| 32.1 188 
CPI: Re gaii [hias 
Low 29.2 | 30.4 | 4.8 | 31.4 185 
High | 39.3 | 40.2 | 7.4 | 40.1 186 
AVL: Theor + 8.83** 
Low 38.9 | 38.1 | 6.5 | 38.3 173 
High | 36.7 | 35.8 | 7.5 | 36.4 186 
AVL: Econ - 20.74** 
Low 39.4 | 40.2 | 6.8 | 39.6 178 
High | 41.5 | 43.5 | 9.6 | 421 186 
AVL: Aesth + | 10.11** 
Low | 37.1 | 38.0 | 8.6 | 39.5 178 
High | 39.5 | 40.7 | 6.9 | 41.2 186 
AVL: Soc + 36 
Low | 41.7| 414 | 6.7 | 40.8 173 
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TABLE 3—Continued 
Test variable Ape rM SDy | Myadj |A cov] Fy N 
High 37.9 37.1 T. 
AVL: Polit Olea eed PA ge 
Low 39.3 39.3 5.7 38.9 173 
High 45.0 42.5 9.6 42. 
AVL: Relig ^ - 2.19 a 
Low 43.2 42.6 7.9 43.1 178 
High 148.1 | 139.7 | 26.8 | 144.0 
D Scale > 9.81** aut 
Low 165.4 | 155.9 | 26.1 | 151.7 187 
*p = .05. 
** » = 0l. 
testing homogeneity of adjusted ferences were significant: Es (.01), 


means,” 

Differences between adjusted means 
proved to be significant in 19 out of 
40 possible instances; 10 of these were 
with male Ss and nine with female Ss. 
In the tables, significance at the .05 
level is indicated by an asterisk fol- 
lowing the F ratio; a double asterisk 
indicates significance at the .01 level. 
For males, the following differences 
were significant: F> (.01), CPI vari- 
ables Ai (.01), Ie (.01), and Re (.05), 
AVL variables T (.01), E (.01), A 
(01), S (.05), and R (.01), and Ds 
(.01). For females, the following dif- 


"Since homogeneity of slope of regression 
of final score on initial score is a precondi- 
lion to the test of significance between ad- 
Justed means, F ratios appropriate to testing 
homogeneity of slopes were calculated. For 
males, one out of 20 F ratios (AVL S: Study 
3) was significant at the .05 level as was one 
of 20 for females (AVL T: Study 3), and 
one for females was significant at the .01 
level (F Scale: Study 5). In view of the 40 
Opportunities for significance to be dis- 
Covered, and in view of the lack of con- 
sistency from study to study, it is not un- 
Teasonable to conclude that the significant 
Instances were false positives attributable 
to chance. On these grounds, it was deemed 
appropriate to consider the evidence of all 
F ratios conducted on adjusted means. 


CPI variables Ai (.01) and Ie (.01), 
AVL variables T (.01), E (.01), A 
(.01), and P (.05), Ds (.01), and E; 
(.05). 

Differences significant at the .05 
level number only four for the entire 
study. It is probably best to give them 
little attention since one or more could 
readily oceur by chance, and since the 
actual difference in adjusted means 
in each case is less than one-quarter of 
the standard deviation of the variable 
in question. 
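
The chance argument above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming purely for illustration that the 40 significance tests are independent (the source makes no such claim, and scales taken by the same Ss are unlikely to be fully independent). The function names are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests reaching significance when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Probability that one or more of n independent tests is significant by chance."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests = 40   # comparisons in the study
alpha = 0.05   # the weaker of the two significance levels reported

expected = expected_false_positives(n_tests, alpha)
p_any = prob_at_least_one(n_tests, alpha)

print(f"Expected chance results at the .05 level: {expected:.1f}")
print(f"Probability of at least one chance result: {p_any:.3f}")
```

Under these assumptions about two chance results would be expected among 40 tests, which is in keeping with the decision to give the four .05-level differences little weight.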


Direction of Change in Relation to 
Aptitude Level 

The direction of the outcome of each 
test is most easily studied by noting 
entries in the column headed Acov. It 
will be remembered that “+” indicates 
a higher adjusted mean score for the 
high-aptitude group, and therefore 
more relative “gain” in score on the 
nonintellectual variable for more apt 
Ss. It appears that the direction of the 
difference is related to the general 
tendency for the score on the non- 
intellectual variable to increase or de- 
crease from initial test to retest. To 
form a rough index, one may sum the 
initial means for the two aptitude 
TABLE 4 

COMPARISONS OF PERSONALITY TEST SCORE CHANGES OVER FOUR YEARS FOR 
HIGH- AND LOW-APTITUDE MALES AND FEMALES: STUDIES 4 AND 5 

Study   | Test variable | Aptitude group | Mx    | My    | SDy  | My adj | Acov | F     | N
Males
Study 4 | E Scale       | High           |  78.6 |  64.7 | 17.5 |  69.1  | -    | 2.26  |  34
        |               | Low            |  98.0 |  80.6 | 22.9 |  76.1  |      |       |  34
Study 5 | D Scale       | High           | 153.6 | 140.2 | 26.9 | 143.9  | +    |  .31  | 112
        |               | Low            | 165.9 | 146.0 | 24.7 | 142.2  |      |       | 110
        | E Scale       | High           |  80.7 |  69.2 | 23.6 |  73.8  | -    |  .95  | 112
        |               | Low            |  94.8 |  81.2 | 25.6 |  76.0  |      |       | 110
        | F Scale       | High           | 116.7 |  98.2 | 23.7 | 103.5  | -    |  .53  | 112
        |               | Low            | 131.9 | 110.9 | 23.6 | 105.5  |      |       | 110
Females
Study 4 | E Scale       | High           |  77.0 |  59.6 | 24.1 |  66.5  | +    |  .39  |  35
        |               | Low            |  96.1 |  71.4 | 28.7 |  64.3  |      |       |  34
Study 5 | D Scale       | High           | 148.0 | 132.1 | 27.4 | 137.2  | -    |  .39  | 146
        |               | Low            | 164.1 | 144.0 | 29.3 | 138.9  |      |       | 146
        | E Scale       | High           |  69.6 |  58.6 | 19.4 |  65.3  | -    | 5.45* | 146
        |               | Low            |  89.8 |  77.2 | 24.9 |  70.4  |      |       | 146
        | F Scale       | High           | 111.8 |  92.4 | 22.8 |  99.2  | -    | 3.39  | 146
        |               | Low            | 130.4 | 110.9 | 30.7 | 104.1  |      |       | 146

*p = .05. 


groups and do likewise for the retest 
means. If the question is then raised 
as to the number of instances in which 
a “general” increase in score over time 
is coupled with the occurrence of a 
higher adjusted mean score for the 
high-aptitude group and in which a 
“general” decrease is associated with a 
lower adjusted mean score for the 
high-aptitude group, it will be found 
that this condition exists in 33 out of 
the 40 possible instances in the study. 
The seven exceptions are: the CPI 
variable Sc for both males and fe- 


males, the AVL variables T, E, and S 
for males, E₄ for females, and D₅ for 
males. Of the exceptions, only the last 
two are characterized by substantial 
change over time. If this analysis is 
reduced to consideration of the 15 in- 
stances in which a difference between 
aptitude groups was found to be sig- 
nificant at the .01 level, it will be ob- 
served that in 13 cases this same rela- 
tionship prevails. In the remaining 
two (AVL variables T and E for 
males), the general shift in level of 
scoring over time is minimal. In other 
words, when there is an overall change, 
there is a substantial tendency for the 
high-aptitude group to show it more 
strongly. 
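
One way to gauge how unusual a tally such as 33 of 40 would be if direction were a coin toss is a one-sided sign test. The sketch below is illustrative only: it treats the 40 comparisons as independent trials with probability .5, an assumption the source does not make, and the function name is hypothetical.

```python
from math import comb


def sign_test_upper_tail(successes: int, n: int) -> float:
    """P(X >= successes) for X ~ Binomial(n, 0.5): a one-sided sign test."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


# 33 of the 40 comparisons showed the pattern described in the text.
p_all = sign_test_upper_tail(33, 40)

# 13 of the 15 comparisons significant at the .01 level showed it as well.
p_sig = sign_test_upper_tail(13, 15)

print(f"33 of 40: p = {p_all:.2e}")
print(f"13 of 15: p = {p_sig:.4f}")
```

Because several scales were completed by the same Ss, the comparisons are not truly independent, so these tail probabilities should be read as rough guides rather than exact significance levels.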


Change as a “Psychologically Posi- 
tive” Phenomenon 


There seems to be general agreement 
among psychologists that lower D, 
E, or F Scale scores indicate more 
“psychologically positive” behavior 
than do higher scores. For these vari- 
ables, the aptitude group with the 
lower adjusted My score was desig- 
nated as the group with the more “psy- 
chologically positive” behavior. 

Either explicit or implicit in the 
numerous validity studies reported by 
Gough (1957) for the CPI scales is 
the notion that higher Sy, Sc, Ai, Ie, 
or Re scale scores indicate more “psy- 
chologically positive” behavior than 
do lower scores. For the CPI variables, 
the aptitude group with the higher ad- 
justed My score was designated as the 
group with the more “psychologically 
positive” behavior. 

It is difficult to determine which 
scales of the AVL are reasonably sub- 
ject to this kind of designation. There 
does seem to be agreement among re- 
search psychologists that higher scores 
on the AVL Theoretical (T) and 
Aesthetic (A) scales are more “val- 
ued" in samples of bright and well- 
educated Ss than are lower scores. This 
value judgment is shared by the pres- 
ent writers, also. Therefore, for the T 
and A scales of the AVL, the aptitude 
group with the higher adjusted My 
score was designated as the group with 
the more “psychologically positive" 
behavior. No decision was made for 
the E, S, P, or R scales of the AVL. 

Table 5 summarizes the results of 
the covariance analyses in a manner 
designed to facilitate study of the re- 
lationship between differences in ad- 
justed performance level of the two 
aptitude groups on each variable and 


the characteristic of "psychological 
positiveness." The question under con- 
sideration is whether high-aptitude 
groups have the more “psychologically 
positive” nonintellectual scale scores 
at the end of 2 or 4 years. 

Consideration of the table shows 
that 32 of the 40 comparisons have a 
designation of more “psychologically 
positive” score, in accord with the des- 
ignations specified in the paragraphs 
immediately preceding. For males, 14 
of the 16 available comparisons favor 
the high-aptitude group. Of the 16 
comparisons, seven showed a signifi- 
cant difference between high- and 
low-aptitude groups, and in all seven 
the high-aptitude group is favored. 
For females, the same figures are de- 
scriptive of the results. Fourteen of 
the 16 available comparisons favor the 
high-aptitude group, and of the 16 
comparisons, the seven showing a sig- 
nificant difference between high- and 
low-aptitude groups all favor the 
high-aptitude group. 
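
The tallies in this section can be rechecked directly from the designations in Table 5. The sketch below transcribes those designations by hand (the tuple layout and variable names are illustrative, not the authors'), with `None` standing for the "?" entries on which no decision was made.

```python
# Each row: (study, variable, male_highest, male_positive, female_highest, female_positive).
# "highest"  = aptitude group with the higher adjusted retest mean.
# "positive" = aptitude group designated more "psychologically positive" (None = "?").
TABLE_5 = [
    (1, "E", "Low", "High", "Low", "High"),
    (2, "D", "Low", "High", "Low", "High"),
    (2, "E", "Low", "High", "Low", "High"),
    (2, "F", "Low", "High", "Low", "High"),
    (3, "CPI Sy", "High", "High", "High", "High"),
    (3, "CPI Sc", "Low", "Low", "Low", "Low"),
    (3, "CPI Ai", "High", "High", "High", "High"),
    (3, "CPI Ie", "High", "High", "High", "High"),
    (3, "CPI Re", "High", "High", "High", "High"),
    (3, "AVL Theor", "High", "High", "High", "High"),
    (3, "AVL Econ", "Low", None, "Low", None),
    (3, "AVL Aesth", "High", "High", "High", "High"),
    (3, "AVL Soc", "Low", None, "High", None),
    (3, "AVL Polit", "Low", None, "Low", None),
    (3, "AVL Relig", "Low", None, "Low", None),
    (3, "D", "Low", "High", "Low", "High"),
    (4, "E", "Low", "High", "High", "Low"),
    (5, "D", "High", "Low", "Low", "High"),
    (5, "E", "Low", "High", "Low", "High"),
    (5, "F", "Low", "High", "Low", "High"),
]

# Rows on which a "psychologically positive" direction was designated.
designated = [row for row in TABLE_5 if row[3] is not None]

males_favoring_high = sum(row[3] == "High" for row in designated)
females_favoring_high = sum(row[5] == "High" for row in designated)
same_direction = sum(row[2] == row[4] for row in TABLE_5)
same_positive = sum(row[3] == row[5] for row in designated)

print(f"Designated comparisons per sex: {len(designated)}")
print(f"Favoring high aptitude: males {males_favoring_high}, females {females_favoring_high}")
print(f"Same direction for both sexes: {same_direction} of {len(TABLE_5)}")
print(f"Same 'positive' group for both sexes: {same_positive} of {len(designated)}")
```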

There is, therefore, strong evidence 
that high aptitude is associated with 
growth in a “psychologically positive” 
direction. The trend in development 
of nonintellectual characteristics ap- 
pears to be similar to that of the de- 
velopment of intellectual characteris- 
tics in that for brighter young people 
greater development in a desirable di- 
rection takes place over time. 


Comparison of the Sexes 

Examination of the 20 parallel com- 
parisons (same study and variable) 
between males and females shows that 
the direction of difference between 
high- and low-aptitude groups (Acov) 
was the same in 17 instances. The ex- 
ceptions were the AVL variable S₃, and 
E₄ and D₅. It may be noted that none 
of the 15 differences significant at the 
.01 level was involved in these excep- 
tions. 

Of the 16 parallel comparisons in 
TABLE 5 

SUMMARY OF COVARIANCE ANALYSES AND “MOST PSYCHOLOGICALLY POSITIVE” 
TEST SCORES FOR ALL GROUPS AND ALL STUDIES 

            | Males                                    | Females
Variable    | Aptitude group | Aptitude group with     | Aptitude group | Aptitude group with
            | with highest   | most “psychologically   | with highest   | most “psychologically
            | My adj.        | positive” My adj.       | My adj.        | positive” My adj.
Study 1
  E         | Low            | High                    | Low            | High
Study 2
  D         | Low            | High                    | Low            | High
  E         | Low            | High                    | Low**          | High
  F         | Low**          | High                    | Low            | High
Study 3
  CPI Sy    | High           | High                    | High           | High
  CPI Sc    | Low            | Low                     | Low            | Low
  CPI Ai    | High**         | High                    | High**         | High
  CPI Ie    | High**         | High                    | High**         | High
  CPI Re    | High*          | High                    | High           | High
  AVL Theor | High**         | High                    | High**         | High
  AVL Econ  | Low**          | ?                       | Low**          | ?
  AVL Aesth | High**         | High                    | High**         | High
  AVL Soc   | Low*           | ?                       | High           | ?
  AVL Polit | Low            | ?                       | Low*           | ?
  AVL Relig | Low**          | ?                       | Low            | ?
  D         | Low**          | High                    | Low**          | High
Study 4
  E         | Low            | High                    | High           | Low
Study 5
  D         | High           | Low                     | Low            | High
  E         | Low            | High                    | Low*           | High
  F         | Low            | High                    | Low            | High

*p = .05. 
**p = .01. 


which it seemed possible to designate 
the direction of psychologically desir- 
able development, 14 were instances 
in which the aptitude group having 
the more psychologically positive ad- 
justed mean was the same for males 
and females. The exceptions were E₄ 
and D₅. 


SUMMARY AND CONCLUSIONS 


All investigators of the effects of 
college attendance upon student per- 
sonality have reported measurable ef- 
fects of a generally desired sort (e.g., 
reduction in prejudice). Local studies 
have demonstrated that similar per- 
sonality changes occurred in young 
adults with no college attendance, 


some college attendance, and with the 
maximum amount of college attend- 
ance possible in 2-year and 4-year 
periods. 

In light of the empirical facts that 
college or college-bound Ss are 
brighter than average, and that in- 
tellectual changes continue well into 
adulthood for Ss who were initially 
brighter than average, a series of stud- 
ies was undertaken to determine if the 
trend in nonintellectual development 
was similar to that of intellectual de- 
velopment for initially brighter-than- 
average Ss. If so, it was felt that this 
might account in large part for the 
positive findings reported in the “col- 
lege impact” research. Existing data 
from five separate longitudinal studies 
of college-bound young adults were 
utilized to determine whether or not 
nonintellectual changes occurred dif- 
ferentially for groups of differing 
academic aptitude. Measures involved 
were the Ethnocentrism Scale, the 
Dogmatism Scale, the Authoritarian- 
ism Scale, five scales from the Cali- 
fornia Psychological Inventory, and 
the six scales of the Allport-Vernon- 
Lindzey Study of Values. 

Considering that separate studies 
were conducted for males and females, 
40 comparisons of nonintellectual 
change over time for high- and low- 
aptitude groups were available. Re- 
test score means were adjusted by 
covariance procedures for initial test 
score, and significance tests were per- 
formed. Of the 40 comparisons, 15 
were significant at the .01 level, and 
four were significant at the .05 level. 
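
The adjustment of retest means for initial score can be illustrated with the values reported in Table 4 for males in Study 4 (E Scale). The sketch below assumes the conventional one-covariate adjustment, adjusted My = My - b(Mx - grand Mx), with b the pooled within-group regression slope. The source does not state its formula or report b, so the slope here is only what the published means imply, and the function names are hypothetical.

```python
def adjusted_mean(my: float, mx: float, grand_mx: float, slope: float) -> float:
    """Retest mean adjusted to the grand mean of the initial scores."""
    return my - slope * (mx - grand_mx)


def implied_slope(my: float, my_adj: float, mx: float, grand_mx: float) -> float:
    """Slope that would turn the observed mean into the reported adjusted mean."""
    return (my - my_adj) / (mx - grand_mx)


# Table 4, males, Study 4, E Scale (values as printed in the table).
high = {"n": 34, "mx": 78.6, "my": 64.7, "my_adj": 69.1}
low = {"n": 34, "mx": 98.0, "my": 80.6, "my_adj": 76.1}

# Grand mean of initial scores, weighted by group size.
grand_mx = (high["n"] * high["mx"] + low["n"] * low["mx"]) / (high["n"] + low["n"])

b_high = implied_slope(high["my"], high["my_adj"], high["mx"], grand_mx)
b_low = implied_slope(low["my"], low["my_adj"], low["mx"], grand_mx)
print(f"Implied slope from high group: {b_high:.3f}; from low group: {b_low:.3f}")

# Applying one illustrative slope recovers the reported adjusted means within rounding.
b = 0.46  # assumed value, chosen to lie between the two implied slopes
adj_high = adjusted_mean(high["my"], high["mx"], grand_mx, b)
adj_low = adjusted_mean(low["my"], low["mx"], grand_mx, b)
print(f"Adjusted means: high {adj_high:.1f}, low {adj_low:.1f}")
```

The two groups imply nearly the same slope, which is what a single pooled slope would lead one to expect. The high-aptitude group's adjusted mean is the lower of the two, matching the "-" entry under Acov for this comparison.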

In 33 of 40 possible instances, an 
overall increase in score over time was 
coupled with evidence of greater in- 
crease for the high-aptitude groups, 
or an overall decrease in score was 
coupled with greater relative decrease 
for the high-aptitude groups. It ap- 
peared possible to make a judgment 
as to which of a pair of adjusted means 
in a given comparison represented the 
more “psychologically positive" devel- 
opment for 32 of the 40 comparisons. 
Twenty-eight of these favored the 
high-aptitude group. 

The following conclusions appear 
justified: 

1. There is substantial evidence to 
suggest that there are differences be- 
tween more and less apt young adults 
in the relative degree of change in 
nonintellectual scale scores over time. 
The findings exhibit a substantial 
tendency for young adults of higher 
aptitude to exhibit more nonintellec- 
tual change over time and in the di- 
rection of the trend of college students 
in general. 

2. There is strong reason to believe 
that nonintellectual changes which oc- 
cur over time among young adults are 
generally of a more “psychologically 
positive” nature as defined by values 
of our culture or of the subculture of 
educated persons. 

3. There is strong evidence that 
high-aptitude groups of young adults 
have substantially more “psychologi- 
cally positive” nonintellectual devel- 
opment over time. 

4. The evidence indicates that non- 
intellectual development is similar in 
its trend to that of intellectual devel- 
opment for initially brighter-than- 
average young adults. 

5. The several results pertaining to 
nonintellectual development observed 
in this study appear to be similar for 
males and females. 

6. The results pertaining to non- 
intellectual development appear to be 
of some generality in regard to differ- 
ent samples of young adults and dif- 
ferent kinds of nonintellectual meas- 
ures. 


LEARNER COMPETENCE, MODEL COMPETENCE, 
AND NUMBER OF OBSERVATION TRIALS 
IN VICARIOUS LEARNING¹ 


FREDERICK H. KANFER AND PRYSE H. DUERFELDT 
University of Oregon Medical School 


The effects of vicarious learning were studied in a paired associate 
nonsense syllable task under conditions of varied model competence, 
varied subject competence, and varied numbers of vicarious trials. 
120 Ss heard a model who was at an early or a late stage of learning. 
The model was heard at either an early or a late stage in Ss’ learning, 
and Ss heard the model for either 1 or 3 of 10 trial blocks. 15 control Ss 
learned only under direct reinforcement. Ss exposed to the model early 
in acquisition learned significantly better than Ss exposed late. Model 
competence and duration of exposure did not affect learning signifi- 
cantly. The results suggested that vicarious trials late in acquisition 
had a disruptive effect, while early vicarious exposure yielded benefits 


similar to those of direct reinforcement trials. 


While observational learning has 
long been used as a teaching adjunct 
in educational practices, attention to 
the specific parameters of this vi- 
carious learning process has been re- 
cent. Studies by Bandura (1962, 1965) 
and his coworkers have demon- 
strated the learning of novel responses 
and the inhibition and extinction of 
previously learned responses as a 
function of observation of a model. 
Kanfer and Marston (1963) have 
compared the effectiveness of vi- 
carious and direct reinforcement and 
have shown that the percentage of 
reinforcement affects the observer's 
io (Marston & Kanfer, 


Since a subject’s (S's) disposition to 
attend to the model’s behavior is 
crucial to his reception and utilization 
of information, it is apparent that the 
relationship between an S and a model 
represents an important variable in 
vicarious learning (Bandura, 1965). 
One aspect of this relationship, the 


¹This study was supported in part by 
apparent competence of the model, 
was examined by Rosenbaum and 
Tucker (1962). They found that 
greater apparent competence of the 
model enhanced learning in the ob- 
server. These results were attributed 
to the fact that S's prior social history 
affects his use of the information pro- 
vided by the model, that is, S has 
learned that more competent models 
provide more efficient and reliable in- 
formation to the observer. A similar 
conclusion was reached by Bandura 
and Kupers (1964) in a study com- 
paring the effectiveness of adult and 
peer models on imitated behavior in 
children. 

From the learner’s viewpoint, his 
own competence level relative to the 
model would further modify the rela- 
tionship. In observing a relatively in- 
competent model, S would be expected 
to use the available intertrial time for 
rehearsal of his already acquired re- 
sponses rather than attending to the 
model’s performance for additional in- 
formation. In fact, continued attention 
to a model who makes predominantly 
incorrect responses should result in an 
interference effect on S. 

The present study was designed to 
consider these several competence pa- 
rameters of the model-subject rela- 
tionship with competence defined not 
by model status or experimenter in- 
structions, but by the model’s actual 
performance. In a three-factor design 
the following variables were studied 
with a verbal paired associate learn- 
ing task: (a) the number of observa- 
tion trials, (b) the acquisition stage 
(competence) of the learner, and (c) 
the acquisition stage of the model. 


METHOD 


Preparation of Model Tapes 


The learning task consisted of a list of 
10 pairs of nonsense syllables of mean asso- 
ciation value. The list was taken from Hov- 
land’s (1939) list. Pairs which showed the 
greatest variability in being learned or not 
learned and pairs in which one member was 
similar in pronunciation to that of another 
pair were discarded. This selection yielded 
a list of 10 paired associate nonsense sylla- 
bles which constituted S's learning task. Ten 
random orders of presentation were repeated 
for 40 trials. There was a 3-second interval 
between presentation of syllable pairs, a 
8-second interval between presentation of 
the stimulus member and response member 
of each pair, and a 2-second exposure of 
the pair. At the end of each list of 10 pairs 
a 4-second intertrial interval was given. Five 
presentations of the list constituted one 
block of trials. 

Errors on the taped presentations were 
taken from actual performances obtained 
from pilot Ss. Thus, both intrusions and 
partially correct syllables appeared. For the 
model at an early learning stage, the tapes 
contained the following number of correct 
responses on each trial: Trial 6:1; 7:1; 8:2; 
9:3; 10:3. On Trial 11:3; 12:3; 13:3; 14:4; 
15:4. On Trial 16:5; 17:5; 18:5; 19:6; 20:6. 
For the tapes representing a model at a 
late learning stage, the following number of 
correct responses were used on each trial: 
21:6; 22:6; 23:6; 24:6; 25:7. On Trial 26:7; 
27:7; 28:8; 29:8; 30:8. On Trial 31:8; 32:8; 
33:8; 34:8; 35:8. The tapes represented ac- 
quisition curves with 1-6 correct responses 
per block for the model at the early acquisi- 
tion stage, and 6-8 correct responses for the 
model representing a later learning stage. 
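
The model tapes described above can be laid out as a small table of correct responses per trial. The sketch below transcribes the counts given in the text and groups them into blocks of five trials. The variable and function names are illustrative, and the block numbering simply assumes that Trials 1-5 form Block 1.

```python
TRIALS_PER_BLOCK = 5   # five presentations of the list make one block
PAIRS_PER_LIST = 10    # ten syllable pairs per presentation

# Correct responses by the taped model on each trial, as listed in the text.
early_model = dict(zip(range(6, 21), [1, 1, 2, 3, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6]))
late_model = dict(zip(range(21, 36), [6, 6, 6, 6, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8]))


def block_of(trial: int) -> int:
    """Block number for a trial, with Trials 1-5 forming Block 1."""
    return (trial - 1) // TRIALS_PER_BLOCK + 1


def block_totals(correct_by_trial: dict) -> dict:
    """Total correct model responses in each block."""
    totals = {}
    for trial, correct in correct_by_trial.items():
        block = block_of(trial)
        totals[block] = totals.get(block, 0) + correct
    return totals


early_totals = block_totals(early_model)
late_totals = block_totals(late_model)
max_per_block = TRIALS_PER_BLOCK * PAIRS_PER_LIST

for name, totals in (("early", early_totals), ("late", late_totals)):
    for block, total in sorted(totals.items()):
        print(f"{name} model, Block {block}: {total} of {max_per_block} correct")
```

Read this way, the early tape covers Blocks 2-4 of a simulated acquisition curve and the late tape covers Blocks 5-7, with per-trial accuracy running from 1 to 6 and from 6 to 8 correct respectively.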

The tapes were prepared by the first au- 
thor and a graduate assistant. They repre- 
sented two variables of the design, the stage 
of model learning, and the number of ob- 
servational trials. The entire learning task 
was taped so that each group heard the 
stimulus word, an interval for S's response, 
and the stimulus-response pair on each trial. 
On those blocks on which vicarious experi- 
ence was offered, S also heard a model re- 
sponse between the stimulus and the stimu- 
lus-response presentations. Following each 
set of 10 pairs, the reader on the tape said, 
"New trial." 


Experimental Design 


A control group learned the list without 
vicarious trials. The following groups con- 
stituted a 2 × 2 × 2 factorial design. In the 
group designation the numbers refer to the 
blocks on which vicarious trials were given 
and the letters E or L refer to an early or 
late model stage of learning. For one-block 
observation, Ss were exposed to the model's 
performance on Block 3, either with a model 
simulating learning during the third block 
of acquisition (3-E), or at a late stage, Block 
6 of acquisition (3-L). Two additional groups 
were given the one-block vicarious experi- 
ence during Block 6, either with an early 
model (6-E), or with a late model (6-L). 

For groups with three blocks of observa- 
tion, the same pattern was replicated. Two 
groups heard either an early model tape, 
simulating Blocks 2, 3, and 4 of acquisition 
(2, 3, 4-E), or a late model simulating 
Blocks 5, 6, and 7 of acquisition (2, 3, 4-L). 
Two groups obtained their vicarious experi- 
ence on Trials 5, 6, and 7, either with an 
early model (5, 6, 7-E), or with a late model 
(5, 6, 7-L). 


Subjects 


All Ss were volunteer college students 
who were paid for participation in this 
study. One hundred and thirty-five Ss were 
run, with random assignment of 15 Ss per 
group. 
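
The group structure can be enumerated to confirm the sample size. This is a minimal sketch: the group labels follow the designations in the text, while the dictionary layout and variable names are illustrative assumptions.

```python
from itertools import product

SS_PER_GROUP = 15

# Blocks on which vicarious trials were given, by number of observation blocks
# and by the learner's stage of acquisition at the time of exposure.
exposure_blocks = {
    1: {"early": "3", "late": "6"},
    3: {"early": "2, 3, 4", "late": "5, 6, 7"},
}

groups = []
for n_blocks, learner_stage, model_stage in product((1, 3), ("early", "late"), ("E", "L")):
    # E or L marks an early or late stage of learning for the taped model.
    groups.append(f"{exposure_blocks[n_blocks][learner_stage]}-{model_stage}")

experimental_ss = SS_PER_GROUP * len(groups)
total_ss = experimental_ss + SS_PER_GROUP  # plus one control group

print(groups)
print(f"{len(groups)} experimental groups, {experimental_ss} experimental Ss, {total_ss} Ss in all")
```

The eight factorial cells plus the control group account for the 135 Ss run, 120 of whom heard a model.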


Procedure 


The S was comfortably seated in an ex- 
perimental room facing a partition behind 
which the experimenter (E) operated a tape 
recorder and recorded S's responses. Prior 
to S's arrival, E selected the appropriate 
tape. The instructions were taken in part 
from Hovland (1939), modified to fit the 
present study. All Ss were told that this 
was an experiment in learning nonsense 
syllables. They would hear pairs of nonsense 
syllables over a tape recorder and their 
job was to learn the pairs so that upon 
hearing the first syllable, they would be 
able to respond immediately with the sec- 
ond. All Ss, except those in the control 
group, were also told: 


People find it easier to learn this kind 
of task when there is a break. Sometime 
at the end of one of the trials I will inter- 
rupt and tell you that it is time for the 
break. Since many people rehearse during 
this break anyway, we have decided to 
let you listen to a subject who took part 
in this experiment the other day, so that 
everybody gets some benefit from the 
break. Then I will tell you when it is time 
to begin again. 

The Ss were encouraged to speak clearly 
and told that it was a very difficult task. 

At the beginning of a vicarious learning 
section, Ss were told: “All right, it’s time for 
the break. Listen, but don’t respond. I will 
tell you when the break is over.” At the 
end of the vicarious learning section they 
were told to respond again. This experi- 
mental design thus yielded learning data 
only for those blocks on which S was re- 
quired to respond. 


RESULTS 


All S responses were recorded in 
blocks of five list presentations. An 
analysis of variance was run on the 
total number of correct responses over 
all blocks in each group in order to 
determine the effect of sex differences. 
None of the analyses revealed a signi- 
ficant difference between male and 
female students in each of the groups. 
Therefore, data from both sexes were 
pooled for subsequent analyses. 

Since Ss in various groups were ex- 
posed to either one or three blocks of 
vicarious learning during which no 
responses were given, and since these 
periods occurred at different stages of 
learning, the main effects of the treat- 
ment variables could best be compared 
by an analysis of the final acquisition 
block (Block 8). A factorial analysis 
was carried out for exposure (one 
block versus three blocks), learner 
stage (early versus late), and model 
stage (early versus late). 


The analysis of variance indicated 
that duration of exposure to the taped 
model did not affect learning. There 
was also no difference between Ss who 
listened to an early model and those 
who listened to a late model who was 
more competent on this task. The only 
significant difference (F = 4.38, p < 
.05) was found between groups in 
which Ss were exposed early during 
acquisition and those who were ex- 
posed near the end of acquisition 
training (learner stage). The Ss who 
heard the model early gave more cor- 
rect responses than Ss who had the 
same experiences at a later learning 
stage. The combined early stage 
groups yielded a mean of 44.4 correct 
responses on Block 8, while Ss who 
were in groups with vicarious learning 
during later acquisition blocks gave 
a mean of 40.4 correct responses. A 
comparison with the control group 
average of 43.7 correct responses for 
Block 8 suggests that the obtained 
difference was due primarily to poor 
learning in groups which were exposed 
to a model tape late in acquisition. 
This finding suggests that vicarious 
learning may be more effective in 
earlier stages than in later stages. 
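
The comparison with the control group is simple arithmetic on the three Block 8 means reported above. The sketch below assumes a maximum of 50 correct responses per block (five presentations of ten pairs), which follows from the task description. The variable names are illustrative.

```python
MAX_CORRECT_PER_BLOCK = 5 * 10  # five list presentations of ten pairs each

# Mean correct responses on Block 8, as reported in the text.
block8_means = {"early exposure": 44.4, "late exposure": 40.4, "control": 43.7}

control = block8_means["control"]
differences = {}
percent_correct = {}
for name, mean in block8_means.items():
    differences[name] = mean - control
    percent_correct[name] = 100 * mean / MAX_CORRECT_PER_BLOCK

for name in block8_means:
    print(f"{name}: {percent_correct[name]:.1f}% correct, "
          f"{differences[name]:+.1f} relative to control")
```

The early-exposure groups sit within a point of the control mean, while the late-exposure groups fall more than three points below it, which is the pattern the text attributes to poor learning in the late groups.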

An alternate explanation of these 
findings might be that the vicarious 
experience had a disruptive effect, and 
Ss in the late groups were exposed to 
this disruption after more learning and 
closer to the test for acquisition in 
Block 8. The disruption hypothesis 
was tested by comparing the perform- 
ance of the experimental groups on 
acquisition blocks immediately fol- 
lowing their vicarious learning expo- 
sure with the performance of the ap- 
propriate nonexposed control groups. 
An analysis of variance was carried 
out on Block 4 comparing Groups 3-E 
and 3-L with 6-E and 6-L. While 
the former groups had experienced 
vicarious learning on a preceding 
block (Block 3), the latter groups served 




as controls without previous vicarious 
experience. The analysis revealed no 
significant difference between these 
groups. A further test was carried out 
by an analysis of variance on Block 5 
for Groups 2, 3, 4-E and 2, 3, 4-L 
versus 6-E and 6-L. This analysis also 
failed to reveal significant differences 
between the groups. For Ss who re- 
ceived the vicarious experience early 
in acquisition, either for one or three 
blocks, there was no apparent disrup- 
tion effect and their performance was 
equal to that of Ss who were never 
exposed to vicarious learning up to 
that stage. 

Comparisons for each of the trial 
blocks were not possible since all 
groups yielded data only on Blocks 1 
and 8. However, a further test of the 
effects of the timing of vicarious learn- 
ing was made by comparing 3-E and 
3-L with 6-E and 6-L on Block 7. A 
factorial analysis of variance was 
carried out for these four groups with 
learner stage and model stage as the 
main factors. The analysis yielded a 
significant F-ratio (4.12, p < .05) 
only for the learner stage. The Ss 
who heard the model on Block 3 
yielded more correct responses than Ss 
who had the vicarious experience on 
the preceding (sixth) block. These 
supplementary analyses suggest that 
exposure to models had a disruptive 
effect only during late acquisition 
trials. Thus, the disruptive effect was 
not simply a function of the interval 
between the vicarious experience and 
the block on which the performance 
was analyzed. The results tend to con- 
firm the hypothesis that a disruptive 
effect occurs mainly when Ss are near 
mastery of the learning task. These 
results seem to be unaffected by the 
relative competence of the taped 
model. 


DISCUSSION 


The results of this study confirm 
earlier observations (Bandura, 1965; 
Kanfer, 1965) that knowledge of the 
specific parameters of vicarious learn- 
ing is an essential step toward its re- 
finement as a practical educational 
tool. The finding that Ss derive more 
benefit from observational learning 
during the early, rather than the later, 
stages of their attempts to master a 
task clearly suggests that the efficient 
use of such techniques as motion pic- 
tures or demonstrations as training 
aids is dependent upon their time of 
presentation. The correlated finding 
that an S near mastery of a task is 
hindered by observing a model at or 
below his level of competence suggests 
that pursuit of such a procedure as an 
educational adjunct may discourage 
student and teacher alike. 

The fact that no differences were 
found between the direct learning and 
either of the early observational learn- 
ing groups supports past findings 
(Berger, 1961; Kanfer & Marston, 
1963) that observation of a model 
can effectively replace much of the 
trial and error learning typically uti- 
lized in developing task mastery. In 
conjunction with the finding that the 
amount of direct learning replaced by 
observation did not differentially af- 
fect performance, a tentative practical 
implication is suggested: When ob- 
servational learning is more economi- 
cal in terms of time, money, danger, 
etc., than direct learning, it may effec- 
tively replace large blocks of the trial 
and error method and promote the 
mastering of a task with equal effi- 
ciency. 

Unlike other studies in which model 
competence has proven an important 
variable (Bandura & Kupers, 1964; 
Rosenbaum & Tucker, 1962), "com- 
petence" was here defined by the 
accuracy of the model's performance 
(as determined by E), not by his 
status or implied response accuracy 
(as appraised by S). Unless S views 
the model as highly competent or 
highly incompetent it appears unlikely 
that he attends differentially to the 
models at different competence levels, 
regardless of their accuracy. This post 
hoc explanation of why model com- 
petency was not an effective variable 
in the present study is given some sup- 
port in a recent unpublished disserta- 
tion by Craig (1964). It was found 
that with model accuracy held con- 
stant, Ss requested assistance from 
models labeled as highly competent 
significantly more than from models 
labeled as low in competence. An im- 
portant factor which may influence the 
contributions of observed trials to 
learning is S’s set as controlled by 
instructions. The present study maxi- 
mized the effects of the vicarious trials 
by implying the potential benefits of 
utilizing the interval designated as a 
"break." Together with the fairly 
high motivation in the volunteer col- 
lege student population sampled here, 
these variables would tend to reduce 
differences in model competence and 
to maximize vicarious learning. The 
present findings must therefore be 
limited in generalization to similarly 
favorable instructional conditions. The 
study demonstrates that observational 
learning can be an effective replace- 
ment for much of traditional trial and 
error learning when knowledge of 
its parameters is taken into con- 
sideration. 


159 college freshman males participated in experiments concerned with 
the effect on original behavior, as measured by 2 tests from Guilford's 


originality test battery (Unusual Uses and Plot Titles), of the fol- 
lowing variables: (a) instructions to be original on the Unusual Uses 


tests; (b) word-association training; and (c) training in the use of 
problem-solving heuristics for the Unusual Uses test. Major hypotheses, 
based on previous literature, were that all variables would significantly 
facilitate performance on the Unusual Uses test, and that the 2nd vari- 
able would significantly facilitate performance on the Plot Titles test. 
Although these hypotheses were confirmed, a further hypothesis deal- 
ing with word-association training, derived by Maltzman (1960) from 
S-R principles, was not supported. The results suggest a testable ap- 


proach to operational training of originality. 


This study reports two experiments 
in which an attempt was made to en- 
hance creative performance. Those 
tests developed by Guilford (1950, 
1957) which call for divergence of 
response to relatively simple tasks 
were used. Maltzman (1960), in a re- 
view of the literature, points out that 
early attempts to enhance creativity 
(e.g., Osborn, 1957) were nonsystem- 
atic in nature. Defining originality as 
statistically infrequent but relevant 
behavior, he has reported a series of 
experimental attempts to produce 
original behavior by using a modified 
free-association technique (Maltzman, 
Simon, Raskin, & Licht, 1960). In- 
creases in scores on the Guilford Un- 
usual Uses (UUT) and Word Asso- 
ciation (WAT) tests were obtained 
with prior repetition of word associa- 
tion in these experiments, but Gallup 
(1963) subsequently obtained similar 
increases with prior arithmetic and 
vocabulary rehearsal, casting doubt 
on the assumption that the training 
needed to be task-relevant. Rather the 
possibility that subjects (Ss) changed 
sets from speed to originality is raised. 
Early work by the Guilford group 
had already suggested this effect 
(Christensen, Guilford, & Wilson, 


1957). Rosenbaum, Arenson, and Pan- 
man (1964) used both instructions to 
make “original” word association and 
five repetitions of word association, 
and found a training effect but no in- 
structional effect on total productivity 
which produced increased originality 
as a secondary effect. 

The experiments reported here test 
the major hypothesis that task-rele- 
vant principles exist for simple and 
complex tasks requiring divergence of 
response, that such principles can be 
deduced and expressed, that the princi- 
ples can be taught, and that Ss who 
master these strategies will outperform 
those who have not. Following Maltz- 
man et al. (1960) and Simon (1961), 
we also set out to replicate the word- 
association training effect, and to ex- 
plore the effects of repetition well 
beyond any they attempted with 
length of word list held constant. 
Given their findings it was hypoth- 
esized that repetition of association 
is an effective means of increasing 
scores on the Guilford UUT and Plot 
Titles Test (PTT). The influence of 
“set” is chiefly determined by the in- 
structions used, and to maximize such 
effects we introduced an instructional 
variable whereby the experimental 

EFFECTS OF TRAINING PROCEDURES ON CREATIVITY 


groups were given written instructions 
to perform as creatively as possible.
By using two-way analysis of variance 
designs for each experimental condition
we were able to test for the influence
of instructions, word-association
repetition, and training in the
“heuristics” of divergent thinking on 
the UUT and the PTT. It was ex- 
pected that all three variables would 
increase UUT scores, and that word 
association would produce significant 
carry-over to PTT scores; thus leaving 
the effects of instructions and heuris- 
tics training on the PTT matters for 
exploration. The design in Table 1 
shows that two distinct experiments 
were conducted, each testing for the 
effects of the instructional variable as 
it interacts with a different type of 
training. 

One persistent artifact in creativity 
research is that increasing the quan- 
tity of responses from Ss may produce 
increased frequency of originality. 
Christensen et al. (1957) have shown 
that this need not be so, but we pre- 
sent additional analysis for PTT data 
to assess the effects of increased levels 
of response. 


METHOD 


Subjects 


Amherst freshmen (N = 159) were ran- 
domly selected from the Amherst freshman 
picture book, randomly assigned to experi- 
mental or control conditions, and contacted 
individually. The contacting procedure was 
standardized, and if an individual could not 
come at his preassigned time, he was re- 
assigned to another time. Over 95% of those 
contacted participated. 


Dependent Variables 


In the UUT S is asked to write unusual 
uses for given objects and not to write the
same use more than once. A revision of the
standard instructions to the UUT was made 
in order to set Ss toward originality of re- 
sponse. The scoring procedure for the UUT 
was kept similar to that of Maltzman and
TABLE 1
THE EXPERIMENTAL DESIGN, SHOWING SIZE
OF GROUPS TESTED UNDER
EACH CONDITION

                              Training
Instructions   None            Heuristics     Word association

Standard       C1  (N = 22)    X1 (N = 19)    X3 (N = 9)
               C3A (N = 12)                   X4 (N = 14)

Revised        C2  (N = 18)    X2 (N = 28)    X5 (N = 8)
               C3B (N = 14)                   X6 (N = 15)


Simon, by employing empirical norms col- 
lected by Maltzman et al. (1960), and count- 
ing responses which occurred not more than 
once in the normative population. A modi- 
fication was that certain responses (but very 
few) were eliminated as not acceptable ac- 
cording to Guilford’s Scoring Guide. 
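
The uniqueness rule described above can be illustrated with a minimal sketch. It assumes the norms are available as a table of response frequencies; the function name, the sample norms, and the sample responses are hypothetical and are not taken from Maltzman et al. (1960). The screening of unacceptable responses against the Scoring Guide is not modeled.

```python
from collections import Counter


def unique_use_score(responses, norm_counts, max_norm_frequency=1):
    """Count distinct responses that occur at most `max_norm_frequency`
    times in the normative sample (hypothetical helper)."""
    # A use written twice by the same subject is counted once.
    distinct = {r.strip().lower() for r in responses}
    return sum(1 for r in distinct if norm_counts.get(r, 0) <= max_norm_frequency)


# Hypothetical normative frequencies for the object "brick".
norms = Counter({"build a wall": 41, "doorstop": 12, "paperweight": 7, "pencil holder": 1})

subject_responses = ["Doorstop", "pencil holder", "bookend for a shelf", "pencil holder"]
score = unique_use_score(subject_responses, norms)
print(score)
```

In this sketch a response absent from the norms counts as unique, which matches the "not more than once" criterion.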

The PTT consists of two story plots, for 
which Ss are instructed to write as many 
appropriate titles as possible in the time 
allowed. Two scores were obtained: number 
of titles (fluency) and originality. Plot titles 
were randomly divided among seven judges 
who were instructed to sort titles into three 
categories of quality as defined by Guil- 
ford’s Scoring Key. 


Training Materials 


Heuristics. A booklet of “strategies” of 
heuristic value in the UUT was written from 
suggestions of others and from the first au- 
thor’s own thinking, and revised through 
pretesting to include five strategies pre- 
sumed to be the clearest and most useful. 
The booklet contains illustrations of the use 
of each strategy and blank spaces for ap- 
plication to three common objects. For ex- 
ample, one strategy is the following: “Trans- 
form the object: burn it, cut it, paint it, 
etc. What uses of your object do these 
transformations suggest?” An example is: 
a brick (with a hole in it) as a pencil holder.
A practice test with three new objects is 
included. 

Word association. The stimulus words 
were taken from those used by Simon (1961) 
and by Maltzman et al. (1960). A tape re- 
corder was used in order to present the 
stimulus words uniformly over the conditions.


Training 

The experimental treatments were ad- 
ministered about 2 weeks after contact was 
made. The two groups scheduled to receive 
heuristics training for the UUT (X 1 and 
X 2) came first on successive evenings. 
Word-association training was given to four 
groups (X 3, X 4, X 5, and X 6) on two 
different evenings. As shown in Table 1 the 
only difference between X 1 and X 2, be- 
tween X 3 and X 5, and between X 4 and 
X 6 was in the instruction variable. 

Heuristics training. Heuristics training 
was conducted in a manner that was judged 
by two confederates to be practically iden- 
tical in X 1 and X 2. The Ss were told that 
they would soon be given a test in which 
they were to write down unusual uses for 
certain objects, and that the following ses- 
sion would consist of the presentation of a 
group of strategies which are thought to be 
helpful in thinking of unusual uses for 
things. The booklet of strategies was distrib- 
uted. For each strategy Ss were to read the 
statement and the examples, and take 2 or 
3 minutes to practice using the strategy on 
their own. Afterwards the experimenter (E) 
called for examples of uses for each of the 
objects. The E tried to express no approval 
or disapproval. The training period was fol- 
lowed by a short practice test. 

Next the booklets were collected and the 
UUT and PTT were distributed in an at- 
tached form. The UUT is given in two parts 
of 5 minutes each, and the PTT is given in 
two parts of 3 minutes each. The instruc- 
tions were the standard written instructions. 

After both tests Ss were asked to indicate 
by the appropriate number which strategies, 
if any, they consciously employed for each 
use they listed. Sheets with spaces exactly 
corresponding to spaces on the UUT and 
headed with the numbered strategies were 
used. These sheets conveniently led to a 
"strategy score" for each S consisting of a 
tally of the instances in which strategy em- 
ployment and unique uses were indicated on 
corresponding spaces. 

Word-association training. The procedure 
for word-association groups was the same 
except that length of training was either 12 
or 16 trials. The Ss were informed that they 
would first take a word-association test. 
They were told that the tape recorder would 
read a list of six words a total of either 12 
or 16 times. Their instructions were to associate
to each word by writing down in the
appropriate space the first word that came 
to mind, and to respond after the first read- 
ing with a different word than the ones they 
had used before. Trials were separated only 
by a pause and the recorded number of the 
trial. 

After this training, E explained that a new 
list of six words would be read, and they 
were again to write down the first word 
that came to mind. Subsequently, the UUT 
and PTT were administered exactly as they 
had been for heuristics groups. 

Next the booklets of strategies were dis- 
tributed with the explanation that these 
booklets had been presented to other stu- 
dents to help them think of unusual uses. 
The E read the booklet to Ss as they fol- 
lowed. Then the booklets were collected and 
Ss were asked to indicate strategy employ- 
ment just as the heuristics-trained groups 
had done. 


Control Groups 


There were three control groups (C 1, 
C 2, and C 3), the first two of which were
controls for all training effects on the two 
posttests, while the third controlled for ef- 
fects of word-association training on the 
WAT. 

The procedure for C 1 and C 2 was 
identical except that C 2 took the revised 
form of the UUT. The UUT and PTT were 
administered in the same manner as for the 
experimental groups. The heuristics strate- 
gies were next introduced exactly as for the 
word-association groups, and Ss were asked 
to indicate the strategies which they had 
employed. 

C 3 was originally scheduled to take only 
the WAT. However, in order to enlarge the 
number of control Ss for the UUT, the 
UUT was subsequently administered, ap- 
proximately half taking each form of the 
test. 


RESULTS 


For the analysis the experiment using
heuristics training will be referred
to as Experiment I and the one using 
word-association training will be re- 
ferred to as Experiment II. It will be 
convenient to divide the presentation 
of results between the results for the 
UUT (including strategy data) and 
the results for the PTT. 
Results for the UUT


The scoring procedure described 
above was used to find the total num- 
ber of unique uses for each S, where 
uniqueness was defined as occurring 
not more than once in the norms of 
Maltzman et al. (1960). Since the 
frequency distributions were highly 
skewed, a square root transformation 
($\sqrt{X + .5}$) was applied to the scores.
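
As a brief illustration of why such a transformation helps with skewed counts, the sketch below assumes the common form sqrt(X + .5); the constant and the sample scores are assumptions made for illustration, not data from the study.

```python
import math


def sqrt_transform(count, constant=0.5):
    """Square-root transformation for count data (assumed form: sqrt(X + .5))."""
    return math.sqrt(count + constant)


# Hypothetical, positively skewed counts of unique uses.
raw_scores = [0, 1, 2, 4, 9, 20]
transformed = [sqrt_transform(x) for x in raw_scores]

raw_range = max(raw_scores) - min(raw_scores)
transformed_range = max(transformed) - min(transformed)
print(raw_range, round(transformed_range, 2))
```

The transformation preserves the order of the scores while pulling in the long upper tail, which brings the distribution closer to what the analysis of variance assumes.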

Results of the analysis of variance 
of UUT results appear in Table 2. In- 
structions were effective at the .05 
level of confidence in both Experiment 
I and Experiment II. Word-associa- 
tion training also had a significant ef- 
fect on scores (p < .05). Heuristics 
training proved to have the most 
highly significant effect (p < .001). 
Interaction was negligible in both ex- 
periments. 
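
The F ratios in Table 2 can be checked from the reported mean squares, since each F is the effect mean square divided by the error mean square of the same experiment. The sketch below uses only the MS values printed in Table 2; the dictionary layout and names are illustrative.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of effect mean square to error mean square."""
    return ms_effect / ms_error


# (effect MS, error MS) pairs as reported in Table 2.
table2 = {
    "Exp I, instructions": (0.11, 0.026),
    "Exp I, heuristics": (0.77, 0.026),
    "Exp II, instructions": (0.32, 0.055),
    "Exp II, word association": (0.28, 0.055),
}

f_values = {name: round(f_ratio(*ms), 2) for name, ms in table2.items()}
for name, f in f_values.items():
    print(name, f)
```

The recomputed values agree with the tabled F ratios to two decimals.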

These results confirm our hypotheses
for the UUT concerned respectively
with heuristics training, word-association
training, and instruction.
Remaining to be considered are the 
analysis of strategy scores and a sub- 
hypothesis suggesting that extended 
word-association training will heighten 
the effects on creative performance. 

The first evidence of strategy em- 
ployment was the proportions of 


*To explore the possible effects of extended
heuristics training, five Ss were recruited
from a psychology class at Amherst.
This class had been administered a form of
the UUT with new objects in a standard
procedure. The Ss participated in a procedure
aimed at improving their performance
on a retest of the UUT in which they
worked on their own through a longer form
of the booklet of strategies. After 1 month,
the standard UUT was administered by
standard procedures. Scoring was standard
on the posttest, but the pretest had to be
scored from norms which were collected
from the sample itself. The reliability coefficient
for the two forms of the UUT was
+.90. All Ss but one improved markedly,
creating the impression that extended acquaintance
with strategies can have marked
facilitating effects.


TABLE 2
ANALYSIS OF VARIANCE ON UUT SCORES

Source                     df     MS       F

Experiment I
  Instructions (A)          1    .11     4.23*
  Heuristics (B)            1    .77    29.62**
  A X B                     1    .01      --
  Error                   109    .026

Experiment II
  Instructions (A)          1    .32     5.82*
  Word Association (B)      2    .28     5.09*
  A X B                     2    .005     --
  Error                   106    .055


Note.—This analysis was done by the 
method for unequal Ns presented by Walker 
and Lev, 1953, pp. 381-382. 

* p< .05. 

** p< .001. 


unique uses which involved the use of 
strategies in each group, that is, the 
ratio of total strategy scores to total 
number of unique uses. These ratios
ranged from .42 to .64. Second, it was
found by a t-test between X 2 and
C 2 in mean strategy scores, at a sig- 
nificance level of p < .01, that heu- 
ristics treatments were associated with 
more successful strategy employment 
than control treatments. 

To show the results of the extension 
of Simon's (1961) study the percent- 
age of control performance gained by 
each amount of word-association repetition
is displayed in Figure 1. Estimates
from Simon's data are compared
with our findings for the 12 and 16
levels under standard instructions.
Appropriate t-tests showed the following:
Both levels differ significantly
from the control level, but the differ- 
ence between them is not significant; 
only the 16 level was significantly 
greater under revised instructions. 
Thus, we have no evidence here that 
extended experience of word associa- 
tion will increase UUT performance 
[Figure 1: line graph. Horizontal axis: number of repetitions of word associations. Vertical axis: percent gained (based on transformed scores). Series: Simon, 1961; Ridley, Birney.]

FIG. 1. Percentage of control performance
on UUT gained for each level of word-association
rehearsal; no data for 10 and 14
repetitions. (Estimates from Simon's, 1961,
data disregard length of word list.)


beyond that obtained with eight repe- 
titions. 
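
The quantity plotted in Figure 1 can be sketched as follows, under the assumption that "percentage of control performance gained" means the treatment-minus-control difference expressed as a percentage of the control mean. That reading, the function name, and the sample means are assumptions; the article does not give the formula or the underlying means.

```python
def percent_gained(treatment_mean, control_mean):
    """Gain over control as a percentage of the control mean (assumed definition)."""
    if control_mean == 0:
        raise ValueError("control mean must be nonzero")
    return 100.0 * (treatment_mean - control_mean) / control_mean


# Hypothetical transformed-score means, for illustration only.
control = 1.20
trained = 1.50
gain = percent_gained(trained, control)
print(round(gain, 1))
```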


Results for the PTT 


The scoring procedure presented 
above was used and only the responses 
to one of the two plots were scored. 
Reliability estimates for the seven 
judges were formed on the basis of 
scores for 40 titles which all judges 
scored in common. With one exception, 
these reliabilities fall between .50 and 
.75, which are comparable to relia- 
bilities obtained by Wilson, Guilford, 
and Christensen (1953). 

Since again the frequency distributions
of scores were skewed, a square
root transformation ($\sqrt{X + .5}$) was
applied to all scores.

Analyses of variance for Experi- 
ments I and II on the PTT are sum- 
marized in Table 3. The instruction 
variable failed to show a significant 
effect in either experiment. There were
significant effects of heuristics (p <
.05) and word-association training (p
< .01). Again, interaction was zero or
negligible. 
These data demonstrate the hypothesized
effect of word-association
training. Since our inspection of the 
other two variables for PTT analysis 
was regarded as exploratory, the heu- 
ristics effect and the lack of an in- 
structional effect will require consider- 
ation later. We have yet to examine 
the relation between effects achieved 
on fluency and originality. 

A t-test between the mean fluency of 
pooled control groups and of pooled 
experimental groups (excluding the 12 
repetition row) disclosed that the lat- 
ter had significantly higher production 
of titles (p < .05). The relevant ques- 
tion is whether the treatment in- 
fluenced both fluency and originality 
or whether the higher originality was 
an artifact of the unexpected fluency 
effect. Product-moment correlations 
between fluency and originality for 
the pooled control groups and the 
four experimental groups here con- 
sidered were high (+.90 and +.76, re- 
spectively). Although the drop in cor- 
relation is significant, suggesting a 
training effect of separating the two 


TABLE 3
ANALYSIS OF VARIANCE ON PTT SCORES

Source                     df     MS       F

Experiment I
  Instructions (A)          1    0        --
  Heuristics (B)            1    .16     4.00*
  A X B                     1    0        --
  Error                    84    .04

Experiment II
  Instructions (A)          1    .06     1.00
  Word Association (B)      2    .34     5.07**
  A X B                     2    .015     --
  Error                    80


Note. This analysis was done by the
method for unequal Ns presented by Walker
and Lev, 1953, pp. 381-382.

* p < .05.

** p < .01.
measures, the question posed above
cannot be answered satisfactorily 
from these data. The possible contri- 
bution of fluency to performance will 
remain an open question, particularly 
on a test such as the UUT which does 
not permit fluency measurement, un- 
less future designs provide a suitable 
measure and control for fluency. 
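
One conventional way to compare two independent correlations such as +.90 and +.76 is Fisher's r-to-z transformation. The sketch below is illustrative only: the article does not state which test was used, and the group sizes here are hypothetical because the pooled Ns are not reported in this passage.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations,
    using Fisher's r-to-z transformation."""
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


# Correlations from the text; sample sizes are assumed for illustration.
z = fisher_z_difference(0.90, 40, 0.76, 60)
significant_at_05 = abs(z) > 1.96  # two-tailed normal criterion
print(round(z, 2), significant_at_05)
```

With other group sizes the statistic changes, so the sketch shows only the form of the comparison, not the authors' result.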


Discussion 


The chief hypothesis stated that 
task-relevant principles exist for crea- 
tive thinking tasks, and that when 
made available to Ss these principles 
will have a facilitating effect on per- 
formance. That strategies are relevant 
to performance on the UUT is indi- 
cated by the report of controls that 
they used strategies for many of their 
uses. However, the finding that heu- 
ristics-trained groups used strategies 
successfully significantly more than 
controls suggests that strategies are 
employed in a more intuitive, random 
fashion prior to training, and that af- 
ter training strategies are used more 
deliberately and efficiently. 

The findings on the second hypothe- 
sized effects confirm the expectation 
that word-association training can en- 
hance performance, but they yield no 
evidence of greater effects as a func- 
tion of length of training. Changes 
made in the design were such that we 
believed that a better test of the 
length of training variable would re- 
sult; consequently we have little con- 
fidence in this variable. Maltzman's 
theory that word-association training 
produced mediated generalization
from uncommon responses in one hierarchy
to those in another is further
attenuated by the observation that Ss
often employed strategies, since it
would be difficult to explain the simultaneous
occurrence of strategies
and mediated generalization. Chiefly, 
however, the evidence supplied by 
Gallup (1963) is damaging to an S-R
interpretation. If there are grounds 
for Gallup’s suspicion that word asso- 
ciation produced a set to respond 
divergently, then different responses 
could be predicted without using S-R 
principles at all. 

That heuristics training involved 
more than strategies is evident from 
the facilitation of the PTT, a task 
for which strategies are not relevant. 
It may be that training of any kind 
will at least partially involve pro- 
ducing a differential in set. Word 
association is not necessarily an es- 
pecially efficient method of achieving 
such a set. 

The findings with respect to the 
other hypothesized effects add to our 
picture somewhat. Instructions to be 
original evidently were perceived as 
specific to the UUT. Although instruc- 
tions quite possibly contribute to a 
set to respond divergently, it is more 
important to notice that a person can 
adjust his performance to be more 
original when it is clear that originality
is called for. Significant differences
in the degree of association between
measures of originality and fluency,
for experimental versus control Ss,
suggest that a set produced by either
type of training makes Ss use their 
resources more efficiently. 

In summary, we would say that Ss 
acquired a set to respond according to 
the instructions and requirements of 
the task, and often used relevant 
principles to so respond. These principles
can be made more accessible to
conscious, systematic use if they are
unambiguously stated and illustrated 
for the S. This was accomplished by 
heuristics treatments, while word-as- 
sociation treatments primarily in- 
fluenced set and left the use of princi- 
ples random and inexplicit. Influence 
on set, however, may make Ss more 
efficient in fulfilling the task. These 
effects of set are not specific to a task,
but they are probably of short-term 
effectiveness. 

The more important question to be 
faced is whether we have facilitated 
originality at all. Although it is absurd 
to claim that we have influenced crea- 
tivity in 20 minutes of training, our 
results may point to an operational 
method of actually training origi- 
nality. This method of training would 
confront Ss with principles for several 
tasks, by way of inducing Ss to search 
for strategies on new tasks without 
training. The first step would be to 
formulate strategies for several tasks. 
If strategies are relevant to the post- 
test in this training, we would expect 
Ss to be more sensitive to their rele- 
vance than controls. 

This line of thought suggests that 
we try conceiving of the creative in- 
dividual as one who is able, when 
faced with a particular creative think- 
ing task, to find the relevant principles 
and apply them. The primary appeal 
of this conception seems to be that 
the criterion of creativity is linked 
firmly with performance, yet preserves 
an important tie with the actual proc- 
ess of creation. The hypothesis of 
deeper levels of the creative process, 
as stated by Gordon (1961) is alto- 
gether plausible, but perhaps better 
set aside at this point until operational 
training has been explored as far as
possible. 
76 Air Force ROTC cadets participated in an investigation of the
effects on learner performance of variations in (a) knowledge of results
for responses to mastery items inserted in instructional material and
(b) reinforcement contingencies for performance on a 100-item final
criterion test over the material. Ss receiving no knowledge of results
for mastery-item responses scored significantly higher (p < .001) on
the 11 mastery tests than did Ss receiving immediate feedback through
chemically treated answer sheets. However, the 2 groups did not
differ significantly in criterion-test performance. ½ of the Ss could
earn $4 for a criterion-test score of 80% or higher, while the remaining
Ss earned $2.50 irrespective of criterion-test performance. No significant
differences in test scores were associated with these reinforcement
conditions. The data suggest that Ss receiving immediate feedback
employ a markedly different strategy from “no feedback” Ss in learning
instructional material.


The facilitating effect of reinforcement
on student learning is well recognized.
Yet in studies of applied
learning the term "reinforcement" frequently
has been used to describe a
variety of stimulus conditions without
specifying the possible differential effects
related to these diverse conditions.
For example, in many studies
knowledge of results for student responses
to en route test items over the
instructional materials has been
treated as reinforcement. Reinforcement
in these studies is a condition
that is intrinsic to (i.e., built into) the
learning materials. In other studies reinforcement
is a stimulus condition extrinsic
to the learning materials. Here,
the presentation of reinforcement is
contingent upon level of performance
on the instructional material, but the
reinforcement is not a condition built
directly into the material. Gold stars,
course grades, and special awards for

*This research was supported under Con- 
tract AF 33 (615) 1507 with the Aerospace 
Medical Research Laboratories, Air Force 
Systems Command, Wright-Patterson Air 
Force Base, Ohio. 


performance serve as examples of ex- 
trinsic reinforcement. 

The present study sought to investi- 
gate the effects of both intrinsic and 
extrinsic reinforcement on student per- 
formance when both conditions were 
employed in the same instructional 
program. The intrinsic reinforcement 
condition included variations in the 
knowledge of results provided for stu- 
dent responses to sets of mastery items 
inserted at various points in the in- 
structional material. The extrinsic re- 
inforcement condition consisted of 
variations in the amount of money 
that could be earned by subjects (Ss) 
for acceptable performance on a final 
criterion test over the material. 


METHOD


Subjects 


The Ss were 76 Air Force Reserve Officers
Training Corps cadets who were second [illegible]
enrolled in the AFROTC
program at Arizona State University. The Ss 
were selected at random from among ap- 
proximately 100 cadets who volunteered to 
participate in the experiment during the 
time when they normally attended 
AFROTC classes and drill periods. 
Design

The study employed a 2 X 2 factorial 
design. The variations in intrinsic rein- 
forcement conditions served as one factor 
in the design. The variations in extrinsic 
conditions served as the second factor. 


Procedures 


The Ss attended four 50-minute class pe- 
riods on a twice-a-week basis to read the 
instructional material and answer the 
mastery items. Each S was randomly as- 
signed either to a room in which knowledge 
of results was provided for responses to 
mastery items or to a room in which no 
knowledge of results was provided. The ex- 
trinsic reinforcement conditions were ran- 
domized within both the “feedback” and 
“no feedback” rooms. Printed instructions 
explaining the appropriate extrinsic rein- 
forcement contingency and specifying the 
time schedule for the experiment were given 
to each S at the beginning of the first class 
meeting. The final criterion test was ad- 
ministered to all Ss at a fifth session 2 days 
after the final instructional period. 

The Ss read the instructional material 
and responded to the mastery tests at their 
own pace. The “feedback” group received 
immediate feedback on their responses to 
each mastery item as a function of chemi- 
cally treated answer blanks. Individuals in 
this group used special pens to mark their 
responses to the mastery items. When S 
marked the correct response blank, the blank 
turned red; when he marked an incorrect 
blank, it turned yellow. The Ss in this group 
were told that if their first response to an 
item was incorrect, they were to continue 
responding to the item until they answered it 
correctly. The “no feedback” group received 
no knowledge of results on their responses 
to the mastery items. 

Two levels of extrinsic reinforcement were 
included in the study. Instructions for Ss 
under the Contingent Reinforcement condi- 
tion stated that S would be paid $4.00 if he 
scored 80% or higher on the final test over 
the instructional material, $2.00 if he scored 
from 50% to 79%, and nothing if he scored 
below 50%. Individuals in the Assured Re- 
inforcement group were told that they 
would be paid $2.50 if they attended all five 
scheduled sessions. Instructions to the As- 
sured Reinforcement group stated that there 
would be a final test over the instructional 
material, but the instructions did not relate 
the test to the $2.50 in any way. Thus, no 
extrinsic reinforcement based upon level of 
performance on the criterion test was available
to the Assured Reinforcement group.
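
The two payment schedules can be written out as a small sketch. The dollar amounts and cutoffs follow the description above; the function names are hypothetical, and the zero payment for an Assured Reinforcement S who missed a session is an assumption, since the text does not say what happened in that case.

```python
def contingent_payment(percent_correct):
    """Payment in dollars under the Contingent Reinforcement condition."""
    if percent_correct >= 80:
        return 4.00
    if percent_correct >= 50:
        return 2.00
    return 0.00


def assured_payment(attended_all_five_sessions):
    """Payment in dollars under the Assured Reinforcement condition.
    Paying nothing after a missed session is an assumption."""
    return 2.50 if attended_all_five_sessions else 0.00


for score in (85, 79, 50, 49):
    print(score, contingent_payment(score))
print(assured_payment(True))
```

Only the contingent schedule ties payment to criterion-test performance, which is the contrast the design depends on.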


Materials 


The textual material used in the experi- 
ment was a revised edition of the Air Force 
manual, The Military Justice System 
(United States Air Force Reserve Officers 
Training Corps, 1962). This text was origi- 
nally selected for revision because it included 
sufficient concept complexity and develop- 
mental continuity directly applicable to Air 
Force and ROTC curriculum. After subject- 
ing the original version of the text to a logi- 
cal analysis of objectives, a final list of 69 
specific behavioral objectives was compiled. 
The Instructional Specification strategy 
(Schutz, Baker, & Gerlach, 1964; Schutz, 
Baker, & Sullivan, 1966) was then employed 
to specify the stimulus conditions required 
for the attainment of these objectives. The 
materials were revised on the basis of 
these specified conditions, item analyses of 
the performance of cadets from earlier 
studies on the final criterion test, analyses of 
interview data from individual cadets who 
had read the materials, and application of 
the gap and mastery principles of pro- 
grammed instruction (Silberman, Coulson, 
Melaragno, & Newmark, 1964). 

A set of 131 mastery items was developed 
pertaining to the behavioral objectives speci- 
fied for the instructional materials. These 
items were grouped into 11 unit-mastery 
tests and inserted at appropriate points in 
the 60-page revised text. Each mastery test 
contained items covering only the material
from the section of the text immediately 
preceding the test. Determination of the 
place in the text where each mastery test 
was inserted was a function of optimum 
length or logical determination of appropri- 
ate homogeneous blocks of content. 

A final criterion test of 100 three- and 
four-choice multiple-choice items was de- 
veloped from an original pool of 200 items. 
Only items with a difficulty index of .75 or 
lower for a sample of 50 Ss who had read 
the original text were included in the final 
100-item test. The reliability coefficient for 
the criterion test, computed by the KR-20 
formula on a sample of 76 Ss who had read
the revised text, was .86.
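
For readers unfamiliar with the KR-20 formula, a minimal sketch follows. It computes KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores) for items scored 0 or 1. The use of the population variance and the small response matrix are assumptions for illustration; they are not the study's data.

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 for a list of subjects' 0/1 item scores."""
    n_subjects = len(item_matrix)
    n_items = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n_subjects
    # Population variance of total scores (an assumed convention).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_subjects
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_subjects  # proportion correct
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: 5 subjects by 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(responses)
print(round(reliability, 2))
```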


RESULTS 


The mean score for each group on
the 100-item criterion test is shown in
Table 1. It is apparent from the table
that the Contingent Reinforcement
group scored approximately three 
points higher than the Assured Rein- 
forcement group, and the “no feed- 
back" group scored slightly more than 
one point higher than the “feedback” 
group. These differences were tested 
for significance using a two-way anal- 
ysis of variance. Neither the main ef- 
fects nor the interaction was statisti- 
cally significant. 
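
The two differences described above follow directly from the cell means in Table 1. Because every cell has 19 Ss, each marginal mean is the simple average of its two cells. The sketch below uses the tabled cell means; the dictionary layout and function name are illustrative.

```python
# Cell means from Table 1, keyed by (extrinsic, intrinsic) condition.
cell_means = {
    ("contingent", "feedback"): 61.37,
    ("contingent", "no feedback"): 61.32,
    ("assured", "feedback"): 57.00,
    ("assured", "no feedback"): 59.63,
}


def marginal_mean(factor_index, level):
    """Average of the cells at one level of a factor (valid for equal cell sizes)."""
    values = [mean for key, mean in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)


extrinsic_diff = marginal_mean(0, "contingent") - marginal_mean(0, "assured")
intrinsic_diff = marginal_mean(1, "no feedback") - marginal_mean(1, "feedback")
print(round(extrinsic_diff, 2), round(intrinsic_diff, 2))
```

The results correspond to the "approximately three points" and "slightly more than one point" reported in the text.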

Analyses of performance on the 
mastery tests, however, did reveal im- 
portant differences between the treat- 
ment groups. Only three of the 38 Ss 
in the “feedback” group failed to complete
all 11 mastery tests during the
four periods of the instructional pro- 
gram. In the "no feedback" treatment, 
however, 15 of the 38 individuals com- 
pleted only 10 or fewer tests and failed 
to reach the final mastery test. 
Clearly, the “no feedback" group was 
spending more time studying either
the textual material or the mastery 
items than was the “feedback” group. 

Descriptive statistics relating to the 
mean standard score on unit-mastery 
tests completed by each S are shown in 
Table 2. Since data on mastery-test 
scores were not available on four Ss
from the contingent-reinforcement-plus-feedback
cell when the statistical
analyses were performed, these Ss
were assigned the computed mean


TABLE 1
CRITERION TEST MEAN SCORES

                               Intrinsic reinforcement
                            Feedback       No feedback        Totals
Extrinsic reinforcement     N    Score     N    Score      N    Score

Contingent ($4-2-0)        19    61.37    19    61.32     38    61.35
Assured ($2.50)            19    57.00    19    59.63     38    58.32
Totals                     38    59.18    38    60.47     76    59.83

TABLE 2
MASTERY TEST MEAN STANDARD SCORES

                               Intrinsic reinforcement
                            Feedback       No feedback        Totals
                                Standard       Standard        Standard
Extrinsic reinforcement     N    score     N    score      N    score

Contingent                 19    47.10    19    53.24     38    50.17
Assured                    19    47.07    19    51.97     38    49.52
Totals                     38    47.09    38    52.61     76    49.85


standard score for their cell on mas- 
tery-test performance. This accounts 
for the slight variation of the grand 
mean score (49.85) from a grand 
mean of 50.00. 

The data in Table 2 reveal that the 
“no feedback” Ss performed consid- 
erably better on the mastery items 
than did the “feedback” Ss. The mean 
standard score for mastery-test per- 
formance is 5.52 standard score points 
higher for the “no feedback" group. 
A two-way analysis of variance of 
mastery-test scores revealed that this 
difference is significant at the .001 
level (F = 22.08). Neither the extrinsic
reinforcement contingency effect
nor the interaction between intrinsic
and extrinsic reinforcement approached
statistical significance.


Discussion


The results of the study suggest that 
there were important differences be- 
tween the “feedback” and “no feed- 
back” treatment groups in the strate- 
gies that they employed to learn the 
instructional material. The better per- 
formance of the “no feedback” group 
on the mastery tests and the failure 
of 15 Ss from this group to finish the 
instructional program indicate that 
the “no feedback” Ss expended more 
time and effort attempting to learn 
from the prose textual material. For
these individuals, of course, this is the
only instructional material in the text. 
The “feedback” Ss, on the other hand, 
apparently neglected the textual ma- 
terial to some degree and used the in- 
structional value of the immediate 
feedback to their mastery-item re- 
sponses. Such a procedure would ac- 
count for their greater speed in work- 
ing through the textual material and 
their inferior performance on the mas- 
tery items. That the “feedback” Ss 
were successful in learning from the 
immediate feedback to their mastery- 
test responses is demonstrated by their 
subsequent performance on the cri- 
terion test. Their criterion-test per- 
formance was comparable to that of 
the “no feedback” group, even though 
their mastery-test performance was 
significantly inferior. 

An interesting phenomenon to note 
here is the apparent sensitivity of the 
learner to subtle procedural cues im- 
plicit in the instructional material. 
For example, one might predict that 
both the “feedback” and “no feed- 
back” groups would study equally 
hard on the textual material and that 
no difference would occur in mastery- 
test performance between the two 
groups. Since feedback is not received 
until after the learner responds to a 
mastery item, one would expect that 
on subsequent items over the same 
material (e.g., the criterion test) the
feedback would result in an advantage 
for the individuals receiving it. How- 
ever, it appears that “feedback” Ss 
quickly observe that they need not 
labor over the textual materials to 
learn the material to be covered on the 
criterion test. Where individuals in the 
“no feedback” group may choose to 
look on the preceding pages for the 
correct answer to a puzzling mastery 
item, an S in the “feedback” group 
can employ the easier and simpler ex- 
pedient of marking in succession his 
highest-order response choice until the
feedback indicates a correct response.

How can one capitalize upon the 
advantages of the intrinsic reinforce- 
ment involved in the immediate feed- 
back procedure on the mastery items 
while at the same time maintaining the 
control of the textual material over the 
reader's learning? One possible pro- 
cedure would be to provide the learner 
with extrinsic reinforcement for ac- 
ceptable performance on the mastery 
items, as well as for acceptable per- 
formance on the criterion test. Thus, 
performance of the “feedback” Ss on 
the mastery test should be improved 
because of the extrinsic reinforcement 
associated with good mastery perform- 
ance. The immediate feedback on 
these items should still serve to facili- 
tate subsequent performance on the 
criterion test. 

A final word should be said con- 
cerning the effect of extrinsic rein- 
forcement in using instructional ma- 
terials. The differences in the levels of 
the monetary reinforcement contin- 
gency employed in the present study 
were not of sufficient strength to sig- 
nificantly affect student performance. 
However, there is little doubt that 
extrinsic reinforcement is required in 
the learning task to maintain control 
of the instructional material over stu- 
dent responses. In the classroom set- 
ting such teacher strategies as the use 
of appropriate verbal statements 
(praise, encouragement, exhortation, 
etc.), assignment of perseverant students
to preferred activities, and the
permission of free choice of student 
activity upon completion of assigned 
work may be employed to develop 
and maintain desired learner responses 
to learning materials. The use of these 
strategies and other effective proce- 
dures by the classroom teacher is an 
essential technique for maximizing the 
effectiveness of instructional materials. 
No matter how excellent the quality
of the material, the student will not 
learn it well unless he is provided with 
an incentive for doing so. 


PREDICTION OF ADOLESCENT POPULARITY AND
REJECTION FROM ACHIEVEMENT AND 
INTEREST TESTS 


HERBERT HOROWITZ
University of Pittsburgh 


15 Project TALENT scores for 1,437 male and 1,505 female high 
school students were correlated with popularity and rejection scores 
achieved with members of their own and the opposite sex. It was 
found that: (a) multiple correlations enhance considerably the pre- 
dictability of such sociometric choices; (b) generally, popularity cri- 
teria were more correlated with predictors than rejection criteria; (c) 
best predictors of both popularity and rejection were English test 
total, information about and interest in sports, and socioeconomic
status; and (d) some variables seem to relate only to popularity or 
only to rejection. The implications of the latter finding for use of 
multivariate analytic methods in isolating clusters of variables asso- 
ciated with popularity and/or rejection were discussed briefly. 


In the 35 years since its introduc- 
tion by Moreno, the technique of 
sociometric measurement has come 
to be used widely for studying inter- 
personal choice. A major trend in this 
research has been the effort to isolate 
factors associated with popularity and 
rejection among children and adoles- 
cents. Summaries of the research findings
in this area, as well as in other
areas of sociometric measurement, are 
contained in the reviews of Lindzey 
and Borgatta (1954) and in The 
Sociometry Reader (Moreno et al., 
1960). To cite but a few of the large 
body of related findings, it has been 
found, for example, that popularity is 
positively associated with athletic 
ability and interest (Feinberg, 1953;
Feinberg, Smith, & Schmidt, 1958; 
Krumboltz, Christal, & Ward, 1959; 
McGraw & Tolbert, 1953), positively
associated with sociability and social 
adjustment (Bonney, 1946; Feinberg, 
1953; Feinberg et al., 1958; Kuhlen 
& Bretsch, 1960), and negatively as- 
sociated with school newspaper activ- 
ity (Krumboltz et al., 1959). 

The present study sought to extend 
these findings using multiple rather 
than single criteria typical of previous 
work in the area and using a larger 
and more representative sample of 
adolescents than has been used previ- 
ously. 


METHOD 


Description of the Sample 


The sociometric data for the present 
study were collected in conjunction with the 
high school testing program constituting 
Project TALENT (Flanagan, Davis, Dailey, 
Shaycoft, Gorham, Orr, & Goldberg, 1964). 
The sociometric data came from a subsample 
of eight schools selected from the 1,353 
schools constituting the TALENT sample. 
The eight schools were chosen for
generalizability to the national high school
age group. Each school in the sample was
representative of one of the eight (continental)
regional areas defined by the United
States Office of Education. In selecting
schools for the subsample, however, an
attempt was made to limit the variability of
the size of the schools. Thus, the number
of students in each grade ranged only from 
57 to 161. Summing across all grades and all 
schools, the sample contained 1,437 males 
and 1,505 females. For all of these individ- 
uals sociometric data as well as regular 
Project TALENT testing data were avail- 
able. 


The Sociometric Criterion Scores 


Within each of the four grades, each male 
and female student was asked to list the 
three boys within their own grade they most
“like to be with” and, in addition, the three
boys they least “like to be with.” Each student
was then required to give the same six
nominations (three positive and three negative)
for the female students in his or her
class. Thus, four kinds of sociometric scores
were obtained for each student. 

1. Same-Sex Attraction Score (SS-A): 
the number of times the individual was nom- 
inated as liked by members of his (her) own 
sex, weighted by dividing by the number of 
same-sex students in that grade. 

2. Opposite-Sex Attraction Score (OS-A): 
the number of times the individual was 
nominated as liked by members of the opposite
sex, weighted by dividing by the
number of opposite-sex students in that
grade.

3. Same-Sex Rejection Score (SS-R): the
number of times the individual was nomi- 
nated as disliked by members of the same 
sex, weighted by dividing by the number of 
same-sex students in that grade. 

4. Opposite-Sex Rejection Score (OS-R): 
the number of times the individual was 
nominated as disliked by members of the 
opposite sex, weighted by dividing by the
number of opposite-sex students in that 
grade. 
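
The weighting common to the four scores defined above can be sketched as a simple ratio. The following is only an illustration of that definition, not the scoring program used in the study; the function name and the example counts are hypothetical.

```python
def weighted_sociometric_score(nominations: int, group_size: int) -> float:
    """Nominations received, divided by the size of the nominating group.

    group_size is the number of same-sex (or opposite-sex) students in the
    nominee's grade, which is the weight described in the text.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return nominations / group_size


# Hypothetical student: named as liked by 6 of the 40 same-sex students
# and by 3 of the 45 opposite-sex students in the grade.
ss_a = weighted_sociometric_score(6, 40)
os_a = weighted_sociometric_score(3, 45)
print(f"SS-A = {ss_a:.3f}, OS-A = {os_a:.3f}")
```

Dividing by group size makes scores from grades of different sizes roughly comparable, which appears to be the purpose of the weights.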

The actual weights used in computing
these scores varied from the low 20's to the
upper 80's. From this it can be seen that the
social groups in which these scores were
taken, while variable, were not very large.

Product-moment correlations among these
criteria in each sex showed the two kinds
of attraction scores (SS-A and OS-A) to be
highly positively correlated, as were the two
kinds of rejection scores (SS-R and OS-R).
Furthermore, correlations between attraction
and rejection scores, while significantly
different from zero, were much lower and
negative. Factor analyses³ of these criteria
in each sex (reported in Horowitz, 1966)
yielded two independent factors, readily
identified as “attraction” (same-sex and
opposite-sex) and “rejection” (same-sex and
opposite-sex). These factors accounted for
approximately 80% of the total variance in
the criterion scores. Nevertheless, it was
decided to use the four criterion scores
separately in the present analyses since it
was possible to obtain variations between the
two kinds of attraction criteria or between
the rejection criteria in their relationships to
the predictors.

³ Principal components solution with units
in the diagonal followed by normalized varimax
rotation.


The Predictors 


In all, 15 Project TALENT variables 
were used as predictors.⁴ These variables
(listed in Table 1) were the ones remaining 
after preliminary analyses with a larger 
group of predictor variables resulted in 
elimination of variables whose relations to 
the criteria had been shown to be weak or 
inconsistent. Although such variable-selec- 
tion procedures have been employed pre- 
viously (e.g., Krumboltz, Christal, & Ward, 
1959), there is always the danger of in- 
advertently capitalizing on chance rela- 
tionships in the data. However, this does not 
appear to be the case here. In the original 
matrices of correlation of 56 predictors and 
4 criteria (sexes combined in these matrices), 
158 (70%) of the 224 correlations were sig- 
nificant at the p = .05 level, as compared 
with 11 correlations which would be significant
if only chance conditions were operating.⁵
Thus, it is clear that many variables
other than the 15 selected were related to the 
criteria. Although the original analyses were 
done with combined sexes, the analyses to be 
reported here were done separately for each 
sex. In this way, it was possible to study the 
replicability of the findings across sex. 
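
The chance argument in this paragraph is simple arithmetic, sketched below using only the figures reported in the text (56 predictors, 4 criteria, 158 significant correlations, p = .05). The variable names are illustrative.

```python
n_predictors = 56
n_criteria = 4
alpha = 0.05
n_significant = 158  # number of significant correlations reported in the text

n_correlations = n_predictors * n_criteria           # size of the matrix
expected_by_chance = alpha * n_correlations          # significant by chance alone
proportion_significant = n_significant / n_correlations

print(n_correlations)
print(round(expected_by_chance, 1))
print(round(100 * proportion_significant, 1))
```

The observed count is roughly fourteen times the number expected under chance, which is the basis for the claim that variable selection did not merely capitalize on chance relationships.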

In the analysis each of the 15 predictors 
was correlated individually with the four cri- 
teria. In addition, multiple correlations were 
performed with composite sets of the 15 
predictors. 


⁴ Standardization and other normative
data about all 15 predictors (except the
socioeconomic index) are contained in Flanagan,
Davis, Dailey, Shaycoft, Orr, Goldberg,
and Newman, 1964.

⁵ With samples of this size, rs as small
as ±.06 are significantly different from zero
at the p = .05 level.


RESULTS

The upper portion of Table 1 shows
the intercorrelations among 15 predictors
and 4 criteria. First, it can be observed
that, in absolute terms, the correlations
are small, none of them exceeding
±.30, and many not exceeding ±.10.
Second, the criteria of attractiveness
(SS-A and OS-A) are generally
more strongly correlated with the predictors
than the corresponding criteria
of rejection. This is particularly true
for the males. A third point to note is
that the correlations show a reflection
in sign from those in the attractiveness
columns (usually positive) to those in
the rejection columns (usually negative).


TABLE 1

PRODUCT-MOMENT AND MULTIPLE CORRELATIONS AMONG FIFTEEN PROJECT
TALENT PREDICTOR VARIABLES AND FOUR SOCIOMETRIC CRITERIA BY SEX

                                           Males                         Females
                                 Attractiveness   Rejection     Attractiveness   Rejection
Predictors                        SS-A    OS-A    SS-R    OS-R    SS-A    OS-A    SS-R    OS-R

Literature information             .12     .14    -.01    -.06     .06     .08    -.09    -.07
Music information                  .08     .19    -.01    -.08     .11     .12    -.12    -.12
Scientific attitude information    .12     .14    -.02    -.06     .11     .16    -.14    -.14
Sports information                 .22     .27    -.11    -.15     .09     .13    -.09    -.10
Sociability scale                  [?]     .22    -.04    -.06     .15     .18     .03    -.03
Impulsiveness scale                .01     .00     .09     .07    -.00     .04     .10     .06
Leadership scale                   .16     .20     .02    -.01     .15     .22     .03    -.01
Literary-linguistic interest       .06     .14     .06     .02     .04     .11    -.02    -.03
Social service interest            .04     .12    -.01    -.06     .10     .10    -.08    -.08
Musical interest                   .02     [?]     .09     .03     .04     .04    -.01     .01
Sports interest                    .18     .17    -.15    -.19     .12     .15    -.04    -.09
Outdoors interest                  .01    -.03    -.06    -.12     .03     .07    -.00    -.02
Labor interest                    -.08    -.13     .01     .01    -.05    -.07     .02     .04
English test total                 .16     .21    -.09    -.12     .18     .16    -.18    -.18
Socioeconomic level index          [?]     [?]     .08    -.04     .12     .19    -.08    -.12
Corrected R (multiple)             .31*    .39*    .25*    .27*    .20*    .34*    .20*    .20*

Note. For samples of this size to be significant at the p = .01 level, r must be greater
than or equal to .08. For males, N = 1,437; for females, N = 1,505.

* p < .01.

The bottom row of Table 1 shows 
the eight multiple correlations between 
each of the criteria and the 15 pre- 
dictors. All of these correlations are 
statistically significant and account
for at least twice as much variance as
the largest bivariate correlations be- 
tween that criterion and any of the 
predictors. That is, there is considera- 
ble enhancement of predictability of 
both popularity and rejection criteria 
through multiple correlation tech- 
nique. 
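
The comparison of variance accounted for rests on squared correlations. As a worked check, the sketch below uses the male OS-A column of Table 1 as it can be read here: a multiple R of .39 and .27 (sports information) as the largest legible bivariate r in that column. The variable names are illustrative.

```python
multiple_r = 0.39           # bottom row of Table 1, males, OS-A
largest_bivariate_r = 0.27  # sports information, males, OS-A

variance_multiple = multiple_r ** 2
variance_bivariate = largest_bivariate_r ** 2
ratio = variance_multiple / variance_bivariate

print(f"multiple R squared:  {variance_multiple:.3f}")
print(f"bivariate r squared: {variance_bivariate:.3f}")
print(f"ratio:               {ratio:.2f}")
```

For this column the multiple correlation accounts for slightly more than twice the variance explained by the best single predictor, in line with the statement above.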

The following variables were the 
best predictors of popularity in both 
sexes: the English test total, interest 
in sports, and the self-rating personal- 
ity scales of sociability and leadership. 
A particularly important additional 
variable for predicting males’ popu- 
larity is their knowledge of sports 
(sports information test). 

The results for rejection are, in some 
respects, the reverse of those for 
popularity. For both males and females,
English test total and information
about and interest in sports are
negatively related to rejection; and, 
for females, the socioeconomic vari- 
able predicts both popularity (posi- 
tive association) and rejection (nega- 
tive association). From these results 
it is clear that some overlap exists be- 
tween factors associated with popu- 
larity and those accounting for rejec- 
tion. To see how much overlap exists, 
the obtained correlations for attrac- 
tiveness were correlated with each 
other and with those for rejection. In 
doing this, the algebraic sums of the 
two product-moment correlations for 
each predictor in each adjacent pair 
of columns were obtained (SS-A + 
OS-A and SS-R + OS-R). Then these 
four sets of 15 sums of correlations 
were themselves correlated. These rs 
are shown in Table 2. The table shows 
that, within a given kind of affect, 
cross-sex correlations of the sums of 
original correlations are positive and 
high (particularly for attractiveness). 
In this way, these results provide some 
proof of replicability of the findings 
for independent subsamples. However,
more importantly, the correlations between
attractiveness and rejection
(both within and across sexes) are also
high and, predictably, negative. Thus,
these data show clearly that the patterns
of association among predictors
and criteria are quite consistent, that
is, that the relative power of different
predictors remains quite stable with
variations in sex of S and nature of
affect.


TABLE 2
INTERCORRELATIONS OF FOUR SETS OF FIFTEEN CORRELATION COEFFICIENTS
BETWEEN FIFTEEN PROJECT TALENT PREDICTORS AND SOCIOMETRIC CRITERIA

                      Attractiveness         Rejection
                      Males   Females     Males   Females
Attractiveness
  Males                        +.87       -.70     -.62
  Females                                 -.88     -.56
Rejection
  Males                                            +.64

Note. All correlations in this table are
significantly different from zero at or beyond
the p = .05 level. See text for explanation
of derivation of correlations reported
in this table.
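
The derivation of the correlations in Table 2 (summing each predictor's two correlations within a kind of affect, then correlating the resulting sets of sums) can be sketched as follows. This is an illustration of the procedure only: it uses five rows of Table 1 rather than all 15, so its output is not expected to reproduce Table 2, and the helper function `pearson` is written here for the example.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Five rows from Table 1: male SS-A, OS-A, SS-R, OS-R, then the same for females.
rows = {
    "Literature information": (.12, .14, -.01, -.06, .06, .08, -.09, -.07),
    "Impulsiveness scale":    (.01, .00, .09, .07, -.00, .04, .10, .06),
    "Leadership scale":       (.16, .20, .02, -.01, .15, .22, .03, -.01),
    "Labor interest":         (-.08, -.13, .01, .01, -.05, -.07, .02, .04),
    "English test total":     (.16, .21, -.09, -.12, .18, .16, -.18, -.18),
}

# Algebraic sums of each adjacent pair of columns (SS-A + OS-A, SS-R + OS-R).
male_attraction = [r[0] + r[1] for r in rows.values()]
male_rejection = [r[2] + r[3] for r in rows.values()]
female_attraction = [r[4] + r[5] for r in rows.values()]
female_rejection = [r[6] + r[7] for r in rows.values()]

print(f"attraction, males vs. females: {pearson(male_attraction, female_attraction):+.2f}")
print(f"rejection, males vs. females:  {pearson(male_rejection, female_rejection):+.2f}")
print(f"males, attraction vs. rejection: {pearson(male_attraction, male_rejection):+.2f}")
```

Note that the unit of analysis in these correlations is the predictor, not the student, so each coefficient in Table 2 rests on 15 pairs of values.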


Discussion 


The present data delineate a pat- 
tern of interpersonal values in the 
adolescent world of American high 
schools which is similar to the one 
described by Coleman in his extensive 
study of The Adolescent Society 
(Coleman, 1961). One of the major 
findings of that report was that, al- 
though athletes were chosen more 
frequently (by several criteria of in- 
terpersonal popularity) than scholars, 
the “athlete-scholars” were the most 
popular of all, suggesting positive 
valuation of both academic and ath- 
letic achievements. The present results 
support that finding in showing strong 
relationships between, on the one 
hand, popularity and rejection cri- 
teria and, on the other hand, interest 
and achievement predictors from both 
the intellectual and athletic domains. 

While the present study supports 
previous investigations in showing a 
multiplicity of associations between 
sociometric choices and other factors,
clearly the variables were not equally 
successful as predictors of popularity 
and rejection. While some overlap 
exists among variables associated with 
popularity and those associated with 
rejection, these variables (English 
test total, sports information, sports 
interest, socioeconomic index) are a 
minority of the complete set of 15. 
The finding that relatively few vari- 
ables relate to both popularity and 
rejection is meaningful in light of the
results obtained with the criterion 
score analyses (Horowitz, 1966). The 
fact that the criteria separated into 
two factors, “attraction” and “rejec- 
tion,” suggests that predictors re- 
lated to attractiveness might be differ- 
ent from those related to rejection. 
Support for this expectation was found 
in the present data, in spite of the 
significantly consistent patterns of 
prediction which were also found
(Table 2). For instance, leadership 
and sociability personality scales have 
fairly strong positive associations 
with popularity, but, with one excep- 
tion, statistically nonsignificant rela- 
tionships to rejection. Similarly, while 
labor interest was negatively related 
to attractiveness, it had only nonsig- 
nificant correlations with rejection. 
Finally, at least one of the 15 varia- 
bles, the impulsiveness scale, seemed 
to be associated only with rejection. 
Thus, the present data, when con- 
sidered alongside the results of the 
criterion analysis, suggest that there 
may be three clusters of psychometric 
variables which predict popularity and 
rejection: those associated only with 
popularity, those associated only with 
rejection, and those associated (to a 
lesser degree) with both popularity 
and rejection. The ultimate test of this 
hypothesis would require a multivari- 
ate analytic procedure with a differ- 
ent, and probably much larger, set of 
predictors which had previously been 
shown to be related to popularity 
and/or rejection. Such a procedure
would not only increase the total vari- 
ance which multiple correlations could 
explain; it would also reveal the par- 
ticular constellations of variables 
which are associated, in the first place, 
with popularity; second, with rejection;
and, finally, with both popularity
and rejection.


TEACHERS' ATTITUDES TOWARD CHILDREN'S BEHAVIOR REVISITED


This study concerned itself with the evaluations of a wide range of
child behavior by 118 elementary school teachers and 23 clinical
psychologists. The Staten Island Behavior Scale was administered to
these two groups with the instruction that each of the 295 behavioral
items be rated as indicating either normal or abnormal behavior. The
items were also grouped into 8 classifications on the basis of a
consensus by a panel of judges. Teachers were found to differ from
psychologists in their attitudes toward child behavior on a significant 22%
of the items, mostly falling in the categories aggressive behavior,
regressive behavior, and affect expression. The teachers tended to regard
almost all of the differentiating behavioral descriptions as abnormal
whereas the psychologists perceived them to be normal. Teacher
experience was found to be a significant variable, with inexperienced
teachers differing more from the psychologists, and ascribing more
pathology to a variety of child behavior, than experienced teachers.
The findings were related to the “continuity hypothesis” and to the
difference in roles between teachers and clinicians.


A classic study by E. K. Wickman 
(1928) found a great discrepancy be- 
tween the views of teachers and men- 
tal health workers toward the be- 
havior problems of children. Although 
this study has been criticized on 
methodological grounds, its influence 
on American education has been pro- 
found. Several more recent investiga- 
tions (Griffiths, 1952; Schrupp & 
Gjerde, 1953; Stouffer, 1952) suggest 
that while there has been considerable 
change in the attitudes of teachers to 
make them more congruent with those 
of clinicians, marked differences be- 
tween the two groups continue to ex- 
ist. These differences are still in the 
direction of teachers being more concerned
with management, sexual adjustment,
and adherence to authority
problems whereas the mental health
professionals are more sensitive to
withdrawal behavior and behavior
not directly related to the school routine
but suggesting a deterioration of
social or emotional patterns. 


The purpose of this study was to 
explore further the relationship be- 
tween the evaluations by teachers and 
psychologists of a wide range of child 
behavior patterns. More specifically, it 
was anticipated that by employing a 
comprehensive scale of unambiguous 
behavioral items, which can be 
grouped on an empirical or theoretical 
basis to focus on patterns of function- 
ing, it would be possible to identify 
the types of behavioral patterns that 
teachers and psychologists perceive 
most differently. The effect that the 
teachers’ experience level has on the 
ratings was also to be determined. 


METHOD


The teacher respondents consisted of 90 
female and 28 male elementary public 
school teachers randomly selected from a 
large urban school system. They were drawn
from all grades ranging from kindergarten
to the seventh grade level, inclusive.
There were 9 teachers at the kindergarten 
level, 13 at Grade 1, 16 at Grade 2, 15 at 
Grade 3, 17 at Grade 4, 16 at Grade 5, 15 
at Grade 6, and 17 at Grade 7. The professional
experience of the teachers ranged
from less than 1 year to 44
years (mean 11.9 years).

The psychologist respondents consisted 
of 17 males and 6 females, all of whom were 
functioning in clinical settings in the same 
state as the teachers. The highest degree 
held by the psychologists was the Ph.D. for 
15 and the M.A. or M.S. for eight. In experience
the psychologists ranged from under
one year to over 30 years (mean 10.7 years). 

The measuring device was the Staten Is- 
land Behavior Scale (Mandell & Silberstein, 
1965) which consists of 295 items descriptive 
of children’s behavior. The items were origi- 
nally selected from published and unpub- 
lished scales used to evaluate children’s ad- 
justment and from an analysis of a large 
number of case records in the files of a child 
guidance clinic. 

The items were classified for the purposes 
of the present study by six raters (5 ad- 
vanced students and 1 Diplomate in Clinical 
Psychology), making independent judg- 
ments, into the following classifications: 
psychosomatic and physical disturbance (71 
items); phobias (18 items); aggressiveness 
(56 items); affect expression (58 items); 
communication disturbance (21 items);
regressive behavior (15 items); inefficiency
indicators (25 items); and fantasy involvement
or withdrawal (31 items).

Each item was placed into the category 
that represented the rating of the majority 
of the judges. An indication of the high de- 
gree of interrater agreement is provided by 
the fact that on 203 of the 295 items (69%) 
at least five of the raters were in complete 
agreement in regard to the classification of 
an item. 

Illustrations of the different types of items
are the following: for the psychosomatic
and physical disturbance classification, “Is
slow in his movements”; for the phobic
classification, “Is afraid of being alone in
a wide open space”; for aggressiveness,
“Hits or attacks other child”; for affect
expression, “Shows inappropriate feeling”; for
communication disturbance, “Talks and
talks”; for regressive behavior, “Carries
blanket”; for inefficiency indicators, “Does
not complete his chores”; and for fantasy
involvement or withdrawal, “Doesn’t join
in competitive games.”

The written instructions accompanying 
the administration of the scale are presented 
below: 


For each of the following items please
indicate whether the behavior in question,
in your opinion, indicates normal or abnormal
behavior in a child falling in the
age range from 1 to 16 inclusive.

Please answer all items without omitting
any, and try to check either the “Normal”
or “Abnormal” category. In the
event that you really cannot decide
whether the behavior is normal or abnormal,
you may check the “Unknown”
line. However, you will probably be able
to arrive at a definite decision in all or
nearly all instances.

The respondents were not given any time 
limit, but were cautioned against collabo- 
rating with anyone else in completing the 
scale. 

The very broad age range was quite de- 
liberately employed in the instructions since 
our intent was not so much to obtain re- 
actions to a child's behavior at a specific 
point in time—even though we recognized 
that the appropriateness of behavior is age 
related—but more to distill from a less struc- 
tured frame of reference behavior patterns 
that most frequently tend to be regarded as 
being normal or abnormal, even in the ab- 
sence of a specific anchoring point. 


RESULTS AND DISCUSSION 


The main findings indicated that 
teachers and psychologists, when their 
responses for each item on the ques- 
tionnaire are compared by chi-square, 
differ significantly (p < .05) on 66 
of the 295 items, that is, on 22.4% of 
the items, in regard to whether they 
rated the behavior to be normal or ab- 
normal. This number of differentiating 
items is significantly greater than 
would be expected on a chance basis 
alone. Of the 66 critical descriptions of 
behavior, 7, or 11%, are in the psycho- 
somatic and physical disturbance cate- 
gory, 1, or 2%, in the phobias cate- 
gory, 21, or 32%, in the aggressiveness 
category, 19, or 29%, in the affect ex- 
pression category, 3, or 5%, in the 
communication disturbance category, 
6, or 9%, in the regressive behavior 
category, 3, or 5%, in the inefficiency 
indicators category, and 6, or 9%, in 
the fantasy involvement or with- 
drawal category. 
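
The item-by-item comparison and the chance baseline can be illustrated with a short sketch. It assumes a 2 x 2 table (group by normal/abnormal rating) and a Pearson chi-square without continuity correction; the report does not state how “Unknown” responses were handled or whether a correction was applied, and the item counts below are hypothetical. The critical value 3.841 is the standard chi-square value for 1 degree of freedom at the .05 level.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_05_DF1 = 3.841  # chi-square critical value, df = 1, p = .05

# Hypothetical item: 40 of 118 teachers and 16 of 23 psychologists rate it normal.
statistic = chi_square_2x2(40, 78, 16, 7)
print(f"chi-square = {statistic:.2f}, significant: {statistic > CRITICAL_05_DF1}")

# Chance baseline for the whole scale: items expected to reach p < .05 by chance.
n_items = 295
expected_by_chance = 0.05 * n_items
print(f"expected by chance: {expected_by_chance:.2f} items; observed: 66")
```

With 295 separate tests, about 15 items would be expected to differ at the .05 level by chance alone, against the 66 actually observed.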

Since the categories consisted originally
of unequal numbers of scale
items, the percentages reported above 
may be somewhat misleading. Another 
way of analyzing the same data is to 
determine the proportion of items 
within each category that differenti- 
ates the teachers' judgments from the 
psychologists’ ratings. Employing this 
approach, we note from Table 1 that 
the greatest disagreement occurs in the 
areas of regressive behavior, aggres- 
siveness, and affect expression. Next in 
order are fantasy involvement or 
withdrawal, communication disturb- 
ance, and inefficiency indicators in 
which areas the two groups are in 
relatively close agreement. In regard 
to phobias and psychosomatic and 
physical disturbance, the judgments of 
psychologists and teachers are very 
much in accord, as can be seen by the 
negligible item disagreement. 
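
The two ways of expressing the same counts (as a share of the 66 differentiating items, and as a proportion of the items within each category) can be reproduced from the figures given in the text. The sketch below is illustrative; the dictionary layout and helper name are not from the original report.

```python
def round_half_up(value: float) -> int:
    """Round a non-negative percentage to the nearest whole number."""
    return int(value + 0.5)


# category: (items in category, items rated differently by the two groups)
categories = {
    "Psychosomatic and physical disturbance": (71, 7),
    "Phobias": (18, 1),
    "Aggressiveness": (56, 21),
    "Affect expression": (58, 19),
    "Communication disturbance": (21, 3),
    "Regressive behavior": (15, 6),
    "Inefficiency indicators": (25, 3),
    "Fantasy involvement or withdrawal": (31, 6),
}

total_differentiating = sum(diff for _, diff in categories.values())

# Share of all differentiating items that falls in each category.
share_of_differentiating = {
    name: round_half_up(100 * diff / total_differentiating)
    for name, (_, diff) in categories.items()
}

# Proportion of each category's own items that differentiate the groups.
within_category = {
    name: round_half_up(100 * diff / size)
    for name, (size, diff) in categories.items()
}

for name in categories:
    print(f"{name}: {share_of_differentiating[name]}% of differentiating items, "
          f"{within_category[name]}% of category")
```

The second set of percentages corrects for category size, which is why regressive behavior, with only 15 items, moves to the top of the ordering.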

These results indicate, therefore, 
that elementary school teachers in 
general tend to evaluate behavior that 
may be described as regressive, ag- 
gressive, and emotional quite differ- 
ently than do psychologists. In view 
of the fact that nearly all of the differentiating
items, that is, 61 of 66, or
92.4%, were rated predominantly nor- 
mal by psychologists and abnormal 
by teachers, it is obvious that ele- 
mentary school teachers perceive re- 
gressive, aggressive, and emotional 
behavior to be considerably more 
pathological than do mental health 
professionals.

Two subgroups of teachers were 
selected based on amounts of teaching 
experience. The Highs consisted of the 
top third in experience of the overall 
group of teachers. The Lows consisted 
of the lowest third in teaching ex- 
perience. The 39 teachers in the High 
subgroup ranged in professional experience
from 14 to 44 years with a
median of 24.5 years; the 39 teachers
in the Low subgroup ranged in experience
from less than 1 year to 6
years with a median of 3 years.


TABLE 1
DEGREE OF DISAGREEMENT BETWEEN TEACHERS AND PSYCHOLOGISTS
IN SPECIFIC SCALE CATEGORIES

                           Number      Percentage of items
Category                   of items    disagreed upon

Physical-psychosomatic        71            10
Phobic                        18             6
Aggressive                    56            38
Affect                        58            33
Communication                 21            14
Regressive                    15            40
Inefficiency                  25            12
Fantasy-withdrawal            31            19

When the attitudes toward child 
behavior of the Lows are compared 
with those of the psychologists, 83 
items were rated significantly differ- 
ently by the two groups. Eighty of the 
83 items, or 96%, were rated normal 
more often by psychologists than by 
teachers with relatively little ex- 
perience. There was disagreement pri- 
marily in regard to the significance of 
aggressive behavior (57% of the items 
in that category were rated differ- 
ently), regressive behavior (33% of 
the items here were differently rated), 
and affect expressions (31% of the 
items were judged differently). There 
was no difference in the ratings of the 
18 phobic items, and relatively little 
disparity in judgments for inefficient
behavior (16%) and fantasy-withdrawal
behavior (16%). Communication
problems and physical-psychosomatic
disturbances produced only
moderate disagreements (19% and
21%).

A similar chi-square analysis was
done comparing the attitudes of teachers
high in experience with psychologists
on each of the scale items. In
this comparison only 45 behavioral descriptions
significantly differentiated
the groups. Moreover, the patterns of
disagreement between highly experienced
teachers and psychologists,
on the one hand, and less experienced
teachers and psychologists, on the
other hand, are very different. For one
thing, the more experienced teachers
did not nearly as often differ from psychologists
in the direction of ascribing
abnormality to a description of
child behavior as did the less experienced
teachers. As a matter of fact,
the differences between the more experienced
teachers and psychologists
were likely to be as often in the direction
of teachers considering the behavior
to be benign when psychologists
regarded it as being pathological as
in the direction of teachers considering
it pathological when the psychologists
rated it as being normal. (Only 52% of the
differentiating items were rated normal by
psychologists more often than by
highly experienced teachers.) Second,
the areas in which the differences become
manifest for the highly experienced
teachers are very different
from those for the Lows. More specifically, the
Highs do not differ from the psychologists
particularly in regard to aggressive,
regressive, and affect behavior
as do the Lows. The least disparity
(0%), as a matter of fact, occurs in
relation to regressive behavior; the
greatest discrepancy (28% of the items
in the category) occurs with the ratings
of phobic behavior.


TABLE 2
DEGREE OF DISAGREEMENT BETWEEN HIGHLY EXPERIENCED AND RELATIVELY
INEXPERIENCED TEACHERS IN SPECIFIC SCALE CATEGORIES

                           Number      Percentage of items
Category                   of items    disagreed upon

Physical-psychosomatic        71            28
Phobic                        18            67
Aggressive                    56            32
Affect                        58            28
Communication                 21            48
Regressive                    15            13
Inefficiency                  25            40
Fantasy-withdrawal            31            26

Finally, chi-square analyses of the 
item ratings for teachers high and 
teachers low in experience yielded the 
largest degree of discrepancy of all 
comparisons. Ninety-six, or 32.5% of
the total number of behavioral de- 
scriptions, were rated significantly dif- 
ferently by these two subgroups. In- 
terestingly, all 96 critical items were 
perceived to be normal more often by 
the highly experienced teachers as 
compared with the less experienced 
teachers. 
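
The item-by-item comparisons reported here can be illustrated with a small computation. The sketch below assumes that each item was scored dichotomously (normal versus abnormal) and that two groups were compared with a Pearson chi-square on a 2 x 2 table. The counts are hypothetical and are not taken from the study, and the function name is illustrative only.

```python
def chi_square_2x2(normal_a, abnormal_a, normal_b, abnormal_b):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table."""
    observed = [normal_a, abnormal_a, normal_b, abnormal_b]
    n = sum(observed)
    row_totals = [normal_a + abnormal_a, normal_b + abnormal_b]
    col_totals = [normal_a + normal_b, abnormal_a + abnormal_b]
    # Expected counts in the same cell order as `observed`.
    expected = [row * col / n for row in row_totals for col in col_totals]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_VALUE_DF1_05 = 3.841  # chi-square critical value, df = 1, p = .05

# Hypothetical ratings of one item: (rated normal, rated abnormal) per group.
chi2 = chi_square_2x2(45, 14, 20, 39)
item_differentiates = chi2 > CRITICAL_VALUE_DF1_05
```

With these made-up counts the statistic exceeds the critical value, so the item would be counted among those rated significantly differently by the two groups.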

Table 2 presents the percentage of 
items within each of the scale cate- 
gories rated differently by highly ex- 
perienced and relatively inexperienced 
teaching personnel. It will be noted 
that phobic behavior tends most often
to be viewed differently by teachers of 
varying degrees of experience, and that 
there is considerable disagreement 
about behavior involving communica- 
tion facility and efficiency. 
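
As a rough check on how the category percentages in Table 2 relate to the overall figure of 96 items, the arithmetic can be sketched as follows. The sketch assumes that the first numeric column of Table 2 gives the number of items in each category and the second the percentage of those items rated differently; that reading of the column headings is an assumption.

```python
# (category, number of items, percentage rated differently), as read from Table 2.
TABLE_2 = [
    ("Physical-psychosomatic", 71, 28),
    ("Phobic", 18, 67),
    ("Aggressive", 56, 32),
    ("Affect", 58, 28),
    ("Communication", 21, 48),
    ("Regressive", 15, 13),
    ("Inefficiency", 25, 40),
    ("Fantasy-withdrawal", 31, 26),
]

total_items = sum(n for _, n, _ in TABLE_2)
# Number of differing items implied by each rounded percentage.
implied_items = sum(round(n * pct / 100) for _, n, pct in TABLE_2)
overall_percentage = round(100 * implied_items / total_items, 1)
```

Under that reading, the category figures reproduce the 96 items and the 32.5% reported in the text.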

Illustrative of the specific differ- 
ences in ratings between the more ex- 
perienced and the relatively less ex- 
perienced teachers are the following 
items, all of which were regarded as
normal by the more experienced
teachers and abnormal by the less ex- 
perienced teachers: 


Cries or whimpers
Plays with or fingers his mouth
Headache
At the slightest upset, coordination becomes poor
Is frightened in crowds
Is afraid of being alone in a wide open space
Child's thoughts are hard to understand
Lying

Lewis (1965), in reviewing the lit-
erature bearing on the “Continuity
hypothesis,” which states that “...
emotional disturbance in a child is 
symptomatic of a continuing psycho- 
logical process that may lead to adult 
mental illness [p. 465],” concluded that 
the acting-out child is more likely to 
become seriously disturbed as an adult 
than the timid, withdrawn child. He 
suggested that perhaps the judgments 
of teachers, as derived from the Wick- 
man (1928) study, represented a more 
accurate appraisal of the pathology of 
children than the evaluations of clini- 
cians, at least when adult psychiatric 
status is taken as the criterion.
Regardless of the validity of the
perceptions of each of these groups, the
study of the nature of the attitudes 
remains an important research prob- 
lem since attitudes will influence 
markedly the interactions between the 
child and his teachers. 

Beilin (1959) pointed out cogently 
that the attitudinal patterns of teach- 
ers and clinicians toward adjustment 
difficulties reflect in part their differ- 
ent roles, and that their roles, in turn, 
“influence the organization of their 
respective experiences [p. 22].” Since 
Beilin regards teachers as essentially
task-oriented, that is, concerned
with the imparting of information and 
skills, and since mental health pro- 
fessionals are more concerned with 
preventing poor adjustment and pro- 
moting good adjustment, it is not sur- 
prising that these two groups will 
continue to perceive child behavior 
differently. 

The present findings suggest that 
psychologists tend to be more accept- 
ing, or at least more tolerant, of a 
greater variety of child behavior than 
teachers, and tend to regard a wider 
range of behavior as being normal. 
Teachers, especially those who are 
relatively inexperienced, label much 


more behavior as being abnormal. 
Teachers are especially critical of cate- 
gories of behavior that may be re- 
ferred to as aggressive, regressive, and 
emotionally expressive. The fact that 
the greatest degree of disagreement is 
found between experienced and inex- 
perienced teachers reinforces the im- 
pression that actual exposure to child 
behavior is an important determinant 
of attitudes toward pathology. 

The present study also bears on the 
frequently voiced criticism of clini- 
cians as being overly sensitive to the 
pathological aspects of others and not 
sufficiently sensitive to their assets. 
The findings indicate that this criti- 
cism is probably unjustified since the 
clinicians were in fact much less prone 
to interpret behavior as being abnor- 
mal than the teachers. 

Brief reference should be made to 
several methodological limitations. 
First, a number of teachers and psy- 
chologists who were given the Staten 
Island Scale either did not complete 
the form or failed to follow instruc- 
tions and had to be eliminated for that 
reason. Thus, of the original sample of 
145 teachers, only 118 could be em- 
ployed for the analysis. Whether the 
respondents who cooperated differ in 
any essential respect from those who 
did not is not known. Second, although 
some precautions were taken against 
the respondents being influenced by 
others in making their ratings, the 
possibility still remains that some 
judgments were not made entirely 
independently. 

Perhaps a more important problem 
is related to the ambiguous instruc- 
tions provided the subjects. Many re- 
spondents found the task to be ex- 
tremely difficult. A number took great 
pains to comment that since what is 
considered normal and abnormal is so
closely related to the age level of the 
child, they could not arrive at a
decision. Moreover, some stated that
since the degree, severity, frequency,
nature of onset, duration, and
circumstances surrounding the appearance of
the symptom remained unspecified, 
their confidence level in arriving at a 
decision was extremely low. Neverthe- 
less, it should be noted that since both 
the professionals and the teachers were 
faced with the same need to impose 
structure on the scale items, there is 
little likelihood that the ratings reflect 
systematic response biases that differ
for the two groups. 

It is suggested that the question of 
whether anchoring the concept of nor- 
mality versus abnormality to specific 
age levels affects the ratings of groups 
of experts and teachers merits further 
research attention. Also, it might be 
possible to investigate the effect of
increased structure in the description of
each item, in terms of such character- 
istics as frequency of the symptom, on 
the judgments. Other extensions of 
this project would concern themselves 
with the variance contributed to
teacher ratings by such variables as
their age, teaching competence, and
socioeconomic status.

ERRATUM


Because of a statistical error, the article by Gerald W. Faust and Richard 
C. Anderson entitled “Effects of Incidental Material in a Programmed Russian 
Vocabulary Lesson” which appeared in the Journal of Educational Psychology, 
1967, 58, 3-10, incorrectly stated that in both experiments the Context groups 
recalled significantly more overtly-practiced Russian words than did No- 
Context groups. In fact, the overall comparison of Context and No-Context 
groups was not significant in either experiment. However, the Context groups 
did show significantly greater (p < .05) recall than the No-Context groups in
both experiments for overtly-practiced Russian words which were easy to pro- 
nounce and for subjects with faster-than-median training times. Consequently, 
despite the statistical error, the main conclusions reported in the paper are sup- 
ported by the data. 


Journal of Educational Psychology 
1967, Vol. 58, No. 3, 181-188 


EFFECT OF FEEDBACK FROM TEACHERS
TO PRINCIPALS¹


ROBERT W. DAW² AND N. L. GAGE
Stanford University 


Each member of an experimental group of 151 elementary school 
principals was given feedback concerning his teachers’ ratings of their 
actual and ideal principals on 12 behaviors. These principals were sub- 
sequently found to differ significantly, in the direction of teachers’ 
preferences, from 143 principals in a control group. Initial differences in
ratings were controlled by analysis of covariance. A 2nd, nonpretested
control group did not differ from the pretested control group; hence the
pretest itself did not produce the effect, and the difference between
experimental and control groups was attributable to the feedback itself. 2
intervals between feedback and 2nd rating, 2 forms of feedback, the
principal's age and experience, and the sequence and direction of the
rating-scale items were found to be nonsignificantly related to the effect
of the feedback. The results suggest that feedback of this kind improves
the behavior of elementary school principals.


It is highly plausible that feedback 
regarding how others feel about one's 
behavior will affect one's behavior. 
Whether this maxim will hold under a 
given set of practical circumstances 
must, however, be determined empiri- 
cally. In the present experiment, ele- 
mentary school principals were told 
how their teachers rated them and an 
ideal principal; other principals, simi- 
larly rated, were not given this in- 
formation. 

One theoretical justification for hy- 
pothesizing that feedback of this kind 


¹ A more detailed presentation of the
data and instruments is available in the
first author’s doctoral dissertation, “Changing
the Behavior of Elementary School
Principals Through the Use of Feedback,”
on file in the Stanford University Library.
The dissertation was written under the
direction of the second author. Support for
the research was provided by the Graduate
Division of Stanford University and by a
small grant (MHI544-01) from the United
States Public Health Service. The present
report was written by the second author
during tenure as a Fellow at the Center for
Advanced Study in the Behavioral Sciences
and as a special Fellow of the National
Institute of Mental Health.

² Now at Santa Maria Joint Union High
School District, Santa Maria, California.


changes behavior has been developed 
by Gage, Runkel, and Chatterjee 
(1960). In brief, the rationale is that 
the feedback will inform some princi- 
pals that their teachers evaluate their 
behavior less favorably than the prin- 
cipals might desire. If we assume that 
principals respect their teachers’ opin- 
ions, we can expect this information 
to create in the principals a condition 
of imbalance (Heider, 1958), asym- 
metry (Newcomb, 1959), incongruity 
(Osgood & Tannenbaum, 1955), or 
dissonance (Festinger, 1957). To re- 
move or reduce this condition, that is, 
to restore a condition of equilibrium or 
consistency, the principals are likely 
to change the behaviors concerned in 
the directions desired by the teachers. 
After enough time has elapsed to al- 
low such behavior changes to occur 
and to be perceived by the teachers, 
a second description of the principals’ 
behaviors by their teachers will
reflect such changes.³


³ William McGuire (personal communication,
August 15, 1966) has suggested that
self-esteem theory—that is, that persons be- 
have so as to maximize self-esteem, not to 
minimize inconsistencies, or discrepancies— 
is more relevant to our experiment. This 
Positive results in experiments on
the effect of feedback of ratings have 
previously been obtained (Bryan, 
1963; Gage, 1963; Gage, Runkel, & 
Chatterjee, 1963). In those experi- 
ments, teachers were rated by their 
pupils and then, on a subsequent 
rating by the same subjects, were 
found to differ significantly, and in the 
direction of the raters’ ideals, from a 
control group not given the ratings. 

The present experiment was aimed 
at determining whether the same ef- 
fects would be found with feedback 
from teachers to principals. It also 
incorporated some refinements in de- 
sign that permitted testing rival hy- 
potheses as to the cause of the change 
in rated behavior. In addition, data 
were gathered concerning different 
time intervals, forms of feedback, 
and personal characteristics of the 
principals. 


METHOD 


Subjects 


The subjects were 455 elementary school 
principals in all the counties in California 
in the fall of 1962. Because they were as- 
signed at random, the principals in the ex- 
perimental and control groups had about 
the same number of pupils and teachers, 
and were similar in sex, age, educational 
level, and years of experience as a
principal, as shown in Table 1. Besides the
experimental (E) group, there were two
control groups: Control Group C₁, which was
rated on both the first and second occasions,
and “posttest only” Control Group C₂,
which was rated only on the second
occasion.

Each superintendent of an elementary 
school district (including unified districts) 
in California with more than one principal 
was invited to send the name of the first 
principal in the alphabetical listing of prin- 
cipals in his district, and then that principal 
was invited to participate. In districts with 


kind of alternative to consistency theory 
has been outlined by Deutsch, Krauss, and 
Rosenau (1962), Steiner and Rogers (1963), 
and McGuire and Millman (1965). 


TABLE 1
DISTRIBUTIONS, MEANS, AND RANGES ON
VARIOUS VARIABLES FOR EXPERIMENTAL
AND CONTROL GROUPS

Variable                              Experimental   Control       Control
                                      group (E)      group 1 (C₁)  group 2 (C₂)
Kind of district
  District with superintendent
    other than the principal          105            111           115
  District where the principal
    also acts as the superintendent   46             32            46
Sex
  Male                                137            130           146
  Female                              14             13            15
Education
  No B.A. degree                      1              0             1
  B.A. degree                         49             40            4 [?]
  M.A. degree                         97             100           115
  Ed.D. degree                        1              [illegible]   4
[Row label illegible; values consistent with number of pupils]
  M                                   509            512           507
  Range                               185-1011       185-1157      200-1175
[Row illegible]
Number of years of experience         [values illegible]
Number of teachers in the school
  M                                   14.8           15.3          15.5
  Range                               8-34           8-97          8-99

Note. For the Experimental Group, N = 151; for
Control Group 1, N = 143; for Control Group 2, N = 161.


a single principal who also acted as super- 
intendent, the superintendent-principal was 
directly invited to participate. Although 
initial attrition was about 25%, the subse- 
quent rate of participation was never less 
than 93% of the principals invited. That 
is, of the 1007 original contacts, 752 yielded 
the name of the person who participated. 
To insure the anonymity of the teacher 
responses, all schools with less than eight 
teachers were eliminated, leaving 500 prin- 
cipals. Of these, 455 completely met all 
other requirements for inclusion in the ex- 
periment. Obviously, this final group may 
have been biased toward containing super- 
intendent-principals (a) interested in what 
their teachers thought of their actions or
(b) reluctant to refuse to participate. 
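
The flow of the sample from first contact to final groups can be checked with a few lines of arithmetic. The figures below are those given in the Method section and the table notes; the variable names are illustrative only.

```python
contacts = 1007                # original contacts with districts
names_received = 752           # contacts that yielded a participating principal
excluded_small_schools = 252   # schools with fewer than eight teachers
final_sample = 455             # principals meeting all requirements
group_sizes = {"E": 151, "C1": 143, "C2": 161}

initial_attrition = 1 - names_received / contacts
eligible = names_received - excluded_small_schools
share_of_eligible_retained = final_sample / eligible
```

The initial attrition works out to roughly 25%, the eligible pool to 500 principals, and the three group sizes add up to the final sample of 455.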


Methods and Schedule of Data Col- 
lection 


The study was made in the school year 
1962-63. Letters of invitation were sent on 
November 30. To minimize the possibility 
that principals would discuss the project 
with one another, only one principal was
invited from each district. By January 5, 
752 principals’ names had been received; 
of these, 252 were excluded because their 
schools had less than eight teachers. Be- 
tween January 2 and 9, booklets entitled 
“What Do They Expect?” (WDTE) were 
mailed to 340 principals in Groups E and 
C₁. These booklets, presenting the research
as the tryout of a “Principal’s Information 
Project,” required the principal to rate him- 
self and his ideal principal on 12 items. 
In addition, the principal was asked to pro- 
vide the information summarized in Table 
1 and to indicate how many Teacher Opin- 
ion Booklets would be needed to collect 
ratings of the principal from his teachers. 
(The WDTE booklet was mailed to Group
C₂ just prior to the mailing of the posttest
materials.) Principals were randomly as-
signed to Groups E, C₁, and C₂ prior to the
mailing of the WDTE booklet. The rate of 
return of these booklets for the three groups 
ranged from 94.0 to 98.7%. 

Teacher Opinion Booklets were mailed 
to the principals on January 21; only eight 
of the 336 booklets mailed were not re- 
turned. In these booklets, the teachers 
rated their actual and ideal ("best imagina- 
ble") principal on the same 12 items. 

On February 11, booklets entitled “Re- 
port on Your Teachers’ Opinions” (RYTO) 
were mailed to the principals in Group E. 
In these booklets, as shown in Figure 1, the 
principal was given, for each of the 12 items,
histograms showing the percentages of his 
teachers who described him and their ideal 
principal with each of the six response
alternatives. A randomly chosen half of the
experimental group (the “a + h” group)
received both the histograms and the me- 
dians (indicated by arrows) of the ratings 
of the actual and ideal principal; the re- 
maining principals (the “a” group) were 
given only the medians of the ratings by 
their teachers. These RYTO booklets were 
withheld from Groups C₁ and C₂ until after
the second round of ratings of principals 
by their teachers. 
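
The two forms of feedback can be illustrated with a short computation of what a single item's report would contain. This is only a sketch: the ratings are hypothetical, the coding of the six response alternatives as 1 through 6 is an assumption, and the function name is illustrative.

```python
from collections import Counter
from statistics import median

ALTERNATIVES = range(1, 7)  # six response alternatives, coded 1-6 (assumed coding)


def summarize(ratings):
    """Return the median rating and the percentage choosing each alternative."""
    counts = Counter(ratings)
    n = len(ratings)
    percentages = {alt: 100 * counts[alt] / n for alt in ALTERNATIVES}
    return median(ratings), percentages


# Hypothetical ratings of one principal by ten teachers on one item.
actual = [1, 2, 2, 3, 2, 1, 4, 2, 3, 2]
ideal = [1, 1, 1, 2, 1, 1, 1, 2, 1, 1]

actual_median, actual_pct = summarize(actual)   # arrow and histogram, actual
ideal_median, ideal_pct = summarize(ideal)      # arrow and histogram, ideal
discrepancy = actual_median - ideal_median      # distance between the two arrows
```

An "arrow-only" report would carry just the two medians, while an "arrow-plus-histogram" report would add the two percentage distributions.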

A follow-up questionnaire, designed to 
encourage careful reading of the RYTO's, 
was mailed on February 25 to all principals 
in Group E. This questionnaire and a re- 
minder elicited responses concerning the 
RYTO from 93% of the experimental group. 

The second round of Teacher Opinion
Booklets was mailed to a randomly chosen
half of Groups E and C₁ on March 25. A
letter had been mailed a week in advance
asking the principal to set aside a specific
day for the administration of these instru-
ments. They were returned with little delay.
Teacher Opinion Booklets were mailed to
the second half of Groups E and C₁ on
May 6. The use of two posttest dates per-
mitted determining whether the effect of
the feedback changed with the length of in-
terval between feedback and posttest.


Fig. 1. Form of feedback, arrow-plus-histogram, in
“Report on Your Teachers’ Opinions.” (Item shown:
Acts Promptly to Fulfill Teacher Requests.)

Items Concerning Principal Behavior


The teachers described their actual prin- 
cipal and their ideal principal on the fol- 
lowing 12 items concerning principal be- 
havior: 

1. Encourages teachers with a friendly 
remark or smile. 

2. Gives enough credit to teachers for 
their contributions. 

3. Does not force opinions on teachers. 

4. Enforces rules consistently. 

5. Criticizes without disparaging the ef- 
forts of teachers. 

6. Informs teachers of decisions or ac- 
tions which affect their work. 

7. Gives concrete suggestions for improv- 
ing classroom instruction. 

8. Enlists sufficient participation by
teachers in making decisions.

9. Demonstrates interest in pupil prog-
ress.

10. Interrupts the classroom infrequently.

11. Displays much interest in teachers’
ideas.

12. Acts promptly in fulfilling teacher
requests.

The items were based on ideas obtained 
from Campbell and Gregg (1957), Gross, 
Mason, and McEachern (1958), Guba and 
Bidwell (1958), Medsker (1956), and vari- 
ous elementary school teachers. Each item 
was worded in both positive and negative 
directions in Forms A and B, respectively. 
Forms A₂ and B₂ placed the items in reverse
order from that in Forms A₁ and B₁.⁴

The items were intended to deal with 
behaviors that could be expected to occur 
frequently, could be briefly described with- 
out qualifying phrases, and could be changed 
by the principal within the time span of 
the research in a way that could be recog-
nized by the teachers.

The 70 items originally written were re- 
duced to the final 12 on the basis of ratings 
of their importance, improvability, and no- 
ticeability; the ratings were made by psy- 
chologists, professors of educational ad- 
ministration, teachers, and principals. 


Teacher Opinion Booklet 


In the Teacher Opinion Booklet, the 
teacher was asked to rate his principal on 
each item using one of the six alternatives 


⁴ The various forms (A₁, A₂, B₁, and
B₂) were randomly assigned across schools;
that is, every teacher in a given school re- 
ceived the same form. 
shown in Figure 1. Then he was asked to
rate his ideal principal on the same 12 items. 
The teachers read the directions silently 
while the principal read them aloud. The 
teacher wrote his responses on a card, put 
the card in an envelope, and sealed the 
envelope. On the cover of his booklet, the 
teacher read: “Your answers will be sealed 
in an envelope by you and sent directly to 
Stanford University. No one at your school 
or in your district will know how you an- 
swered these questions.” Further to insure 
the teachers’ privacy, the principal was di- 
rected to stand far enough away from his
teachers to prevent him from seeing their 
papers, to make certain that all teachers’ 
answer cards were sealed in the envelope, 
to require the teachers to put their sealed 
envelopes into the large mailing envelope, 
to permit his teachers to see him moisten 
and close the clasp of the large envelope, 
and finally to ask one of his teachers to 
drop the envelope in a United States mail 
box. The administration of the question- 
naires was standardized as fully as possible 
with printed directions to the principals 
and teachers so as to assure both groups 
that their anonymity would be preserved. 
It is noteworthy that all answer cards came 
back in sealed envelopes. 


“Report on Your Teachers’ Opinions”


The “Report on Your Teachers’ Opin- 
ions” consisted of a booklet containing two 
charts for each of the 12 items, as shown in 
Figure 1. The discrepancy between the 
teachers’ descriptions was implicit in the 
vertical distance between the two arrows 
indicating medians. If the two arrows were
at the same point on these scales, the prin-
cipal could infer that his teachers saw no
difference between him and their ideal prin-
cipal in that kind of behavior. In propor-
tion to the distance between the two arrows,
the principal could infer that he departed
in the given direction from his teachers’
ideal for that kind of behavior.

To determine the principal’s reaction to 
the RYTO and to encourage him to review 
these reports, he was asked to answer six 
questions on a reaction sheet. The re- 
sponses indicated that high percentages 
(85-95%) of Group E found the RYTO 
interesting, understandable, and informa- 
tive. 


Experimental Design 


The experimental design is shown in
Figure 2. Here, X represents the experi-
mental treatment (feedback), O refers to
the process of observation (ratings), and
Os in a given row are applied to the same
subjects. The left-to-right dimension indi-
cates temporal order, and the rows represent
equivalent samples of persons. Xₐ repre-
sents “arrow-only” feedback; Xₐ₊ₕ repre-
sents “arrow-plus-histogram” feedback.
Group C₂ (posttest only) was used to allow
comparisons free of any effect attributable
to unintended feedback received by Group
C₁ through participating in the pretest;
these comparisons would show whether the
pretest itself produced changes in behavior.

Fig. 2. The experimental design.


RESULTS 


In describing the results, we shall 
refer to the protocols obtained from 
the teachers as follows: 

Pre-ACT: the teacher's description
of his actual principal on the pretest.

Post-ACT: the teacher's description
of his actual principal on the
posttest.

Pre-IDL: the teacher's description
of his ideal principal on the pretest.

For each item, the mean of the 
ratings received by a group of princi- 
pals was computed over the medians 
of the ratings of each principal by his 
teachers. The ratings of all items, 
regardless of whether they were origi- 
nally worded positively or negatively, 
were converted to a scale in which 1
signified the desirable end, and 6 the 
undesirable end, of the continuum. 
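
The scoring just described can be sketched in a few lines. The reversal rule for negatively worded forms (7 minus the raw rating on a 1-6 scale) is an assumption, since the text states only that all ratings were converted to a common direction; the ratings and names are hypothetical.

```python
from statistics import mean, median

SCALE_MAX = 6


def to_common_scale(rating, negatively_worded):
    """Map a raw 1-6 rating so that 1 is always the desirable end (assumed rule)."""
    return SCALE_MAX + 1 - rating if negatively_worded else rating


def group_mean_of_medians(ratings_by_principal):
    """Mean, over principals, of the median rating each received from his teachers."""
    return mean(median(ratings) for ratings in ratings_by_principal.values())


# Hypothetical raw ratings on one item: (teachers' ratings, form negatively worded?)
raw = {
    "principal_a": ([1, 2, 2], False),
    "principal_b": ([2, 3, 3], False),
    "principal_c": ([6, 5, 5], True),
}
converted = {
    name: [to_common_scale(r, negative) for r in ratings]
    for name, (ratings, negative) in raw.items()
}
group_mean = group_mean_of_medians(converted)
```

Taking the median within each school first means that each principal contributes one value to the group mean, regardless of how many teachers rated him.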


Pre-ACT 


The random assignment of princi-
pals to Groups E and C₁ should have
made them equivalent. But the pre- 
ACT means of these groups differed 
at the .05 level on five of the 12 items 
and on the mean of the 12 items. Also, 
the direction of the difference was the 
same for all 12 items, namely, the 
direction favoring Group E. Presuma- 
bly, however, the analysis of covari- 
ance, to be described below, elimi- 
nated this pretest bias to the extent 
that the pretest scores were reliable. 

The difference between Groups E
and C₁ in pre-ACT means might have
resulted from a greater tendency of
subjects receiving unfavorable feed-
back on pre-ACT to drop out of
Group E. In that event, the remaining
members of Group E would be those
who had received more favorable pre-
ACT ratings. But when this possi-
bility was investigated, it was found
that, in fact, the few drop-outs from
Groups E and C₁ did not differ on pre-
ACT in this way, and the difference
remained unexplained.

The two experimental subgroups 
differing in the type of feedback pro- 
vided (arrow-only and arrow-plus- 
histogram) did not differ significantly 
on any of the 12 pre-ACT means. Nor 
were there any significant differences 
in pre-ACT means among the sub- 
groups given forms differing in the 
direction of wording or the sequence 
of the items. 


Adjusted Post-ACT 

The effect of the feedback was 
measured by the difference between 
Groups E and C₁. To adjust for the
pre-ACT differences between the 
groups, analysis of covariance was 
used. The pre-ACT rating served as 

TABLE 2
MEANS OF TEACHERS’ RATINGS OF ACTUAL
PRINCIPALS IN EXPERIMENTAL GROUP AND
CONTROL GROUPS C₁ AND C₂

[Table body not recoverable.]

Note. [Beginning of note illegible] ... for Very much UNLIKE.
For Experimental Group (E), N = 151; for Control Group 1 (C₁),
N = 143; for Control Group 2 (C₂), N = 161.

* p < .05.

** p < .01.

the control variable, the post-ACT 
rating as the dependent variable, and 
the feedback as the independent vari-
able. Although the 12 items are corre-
lated, it is considered worthwhile to
examine results for each item indi-
vidually as well as for the mean over
all 12 items. Analyses of covariance
for each of the 12 items and for the 
mean over Items 1-12 yielded the re- 
sults shown in Table 2. The difference 
between the adjusted post-ACT means 
was significant at the .05 level or 
better for all but two of the 12 items; 
for the mean over Items 1-12, it was 
significant at the .01 level. In all 
cases, the difference between the ad- 
justed post-ACT means favored the 
experimental group. Only on Items 9 
and 10 did the adjusted post-ACT 
means not differ at even the .05 level 
of significance. But even on these 
items, the difference favored the ex-
perimental group. Item 9, “Demon-
strates interest in pupil progress,” may
have been too difficult for the teachers
to perceive in the period allowed.
Item 10, “Interrupts the classroom 
infrequently,” may have been rated 
too favorably on the pretest to allow 
sufficient room for improvement; the 
pre-ACT means on Item 10 were the 
most favorable (had the lowest nu- 
merical value) of all 12 items. 
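
The adjustment used above can be illustrated with a minimal sketch of the standard one-covariate calculation, in which each group's post mean is corrected by the pooled within-group slope times that group's deviation from the grand pretest mean. The data are hypothetical and the function name is illustrative; this is not the authors' computing procedure, which the text does not describe.

```python
from statistics import mean


def adjusted_post_means(groups):
    """Covariance-adjusted post means; `groups` maps name -> list of (pre, post)."""
    grand_pre_mean = mean(pre for pairs in groups.values() for pre, _ in pairs)
    sum_xy = sum_xx = 0.0
    for pairs in groups.values():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        sum_xy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sum_xx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)
    slope = sum_xy / sum_xx  # pooled within-group regression of post on pre
    return {
        name: mean(post for _, post in pairs)
        - slope * (mean(pre for pre, _ in pairs) - grand_pre_mean)
        for name, pairs in groups.items()
    }


# Hypothetical (pre-ACT, post-ACT) medians; lower values are more favorable.
ratings = {
    "E": [(1.4, 1.3), (1.6, 1.5), (1.8, 1.7), (2.0, 1.9)],
    "C1": [(1.6, 1.6), (1.8, 1.8), (2.0, 2.0), (2.2, 2.2)],
}
adjusted = adjusted_post_means(ratings)
```

In this made-up example the raw post-ACT difference of 0.3 between the groups shrinks to 0.1 once the initial advantage of Group E is taken into account.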


Posttest-Only Control Group 


The “posttest-only” control group
(C₂) was used to eliminate certain
possible attenuating effects on the
comparisons. Such effects might have
resulted from any feedback, or sensi-
tization to the items of behavior, that
might have been received by the pre-
tested control group (C₁) as a result
of their participation in the pretest.
If such sensitization occurred, Group
C₁ would differ less from Group E
than would a nonpretested control
group (C₂) which received neither the
feedback nor the pretest. Here the
comparisons must be made in terms
of unadjusted posttest means, since
there were no pretest means with
which to adjust the posttest means of
Group C₂.

Table 2 also shows the means for
Group C₂. The means for Groups C₁
and C₂ did not differ significantly.
Both of these groups did differ, in the
same direction, from Group E. Hence,
the pretest in itself did not affect
Group C₁, and the improvement in
Group E must be ascribed to the feed-
back alone, not to the feedback plus
the pretest.


Adjusted Post-ACT Minus Pre-IDL 


The teachers' initial ratings of their 
ideal principal (pre-IDL) make possi- 
ble an interpretation of the direction 
of the difference between adjusted 
post-ACT means. It should be re- 
called that the feedback informed 
Group E as to how their teachers rated 
the ideal principal. The feedback 
should influence the principal to 

TABLE 3
ADJUSTED MEAN POST-ACT AND PRE-IDL
RATINGS FOR THE EXPERIMENTAL GROUP
AND CONTROL GROUP C₁

         Adjusted                         Adjusted post-ACT
         post-ACT         Pre-IDL         minus pre-IDL
Item     E       C₁       E       C₁      E       C₁
1        1.38    1.43     1.09    1.08    .29     .35
2        1.50    1.58     1.13    1.14    .37     .44
3        1.61    1.66     1.21    1.22    .40     .44
4        1.69    1.74     1.20    1.17    .49     .57
5        1.42    1.49     1.13    1.17    .29     .32
6        1.48    1.51     1.09    1.10    .39     .41
7        1.81    1.87     1.19    1.25    .62     .62
8        1.63    1.67     1.20    1.23    .43     .44
9        1.30    1.34     1.10    1.13    .20     .21
10       1.23    1.26     1.15    1.19    .08     .07
11       1.45    1.54     1.13    1.18    .32     .36
12       1.52    1.57     1.13    1.16    .39     .41
1-12     1.49    1.57     1.14    1.17    .35     .40

Note. For the Experimental Group, N = 151; for
Control Group C₁, N = 143.



change in that direction. Hence, the 
difference between adjusted post-ACT 
and pre-IDL should be smaller for 
Group E than for Group Ci. As Ta- 
ble 3 shows, most of these differences 
were indeed smaller for the experi- 
mental group. At the time of the post- 
test, the principals who received feed- 
back came closer to their teachers’ 
desires. Since the items are interde- 
pendent, they do not, of course, pro- 
vide 12 separate tests of the overall 
hypothesis. But the means over all 
12 items also differed in the expected 
direction. 
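
The statement that most differences were smaller for the experimental group can be checked directly against the difference columns of Table 3. The values below are transcribed from those two columns; the variable names are illustrative only.

```python
# Adjusted post-ACT minus pre-IDL for Items 1-12, as read from Table 3.
diff_experimental = [.29, .37, .40, .49, .29, .39, .62, .43, .20, .08, .32, .39]
diff_control = [.35, .44, .44, .57, .32, .41, .62, .44, .21, .07, .36, .41]

pairs = list(zip(diff_experimental, diff_control))
smaller_for_e = sum(e < c for e, c in pairs)
equal = sum(e == c for e, c in pairs)
larger_for_e = sum(e > c for e, c in pairs)
```

By this count the experimental group came closer to the teachers' ideal on 10 of the 12 items, with one tie (Item 7) and one reversal (Item 10).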


Interval, Form of Feedback, and 
Other Variables 


In the experiment by Gage, Runkel, 
and Chatterjee (1960), experimental 
group subjects changed more if they 
had a longer interval between feedback 
and posttest. But the time interval 
was relatively short, ranging from 18
to 59 days. In the present experiment,
the experimental and control groups
were assigned at random to either a
6-week or a 12-week interval group.
Analysis of covariance revealed no 
significant difference in the two groups’ 
adjusted post-ACT means. Further, 
there was no significant effect due to 
the Feedback x Interval interaction. 
In short, the effect of the feedback 
was not a function of the interval over 
which it was measured. 

One randomly chosen half of the 
experimental groups received feed- 
back in the form of median ratings 
(arrows) only, and the other half 
received median ratings plus the fre- 
quency distributions (histograms) of 
the ratings. Either kind might rea-
sonably be predicted to be the more
effective: The median-only might be
sharper, less ambiguous; on the other 
hand, the median-plus-distribution 
might be more convincing. But no 
significant difference was found for 
any item between the adjusted post- 
ACT means of the subgroups receiv- 
ing these two kinds of feedback. 

When the principals were divided 
into subgroups on the basis of age 
(40 or younger versus 41 or older), 
experience (5 years or less as ele- 
mentary school principal versus 6 
years or more), and form (A1, A2, 
B1, B2) of the Teachers’ Opinion 
Booklet, analyses of covariance 
yielded no significant difference at 
the .05 level due to main effects or in- 
teractions. Thus, the significant re- 
sults that occurred due to feedback 
did not seem to vary with age, experi- 
ence, or form. The latter findings show 
that positional or directional sets did 
not significantly affect the teachers’ 
responses. 


Discussion 


All in all, the results indicate that 
the feedback affected changes in the 
principals’ behavior. Various questions 
remain, however, for subsequent re- 
search. First, we must recall the un- 
explained pretest differences between 
the experimental and control groups 
despite their random assignment. 
Second, we must ask whether methods 
of measuring behavior other than re- 
ratings by the same teachers would 
reveal the same kinds of behavior 
change. Would observations by trained 
observers produce confirming evi- 
dence? Would interviews of the princi- 
pals reveal the process by which feed- 
back operates, and show whether the 
principals were consciously attempt- 
ing to change their behavior? Would 
disguised tests of teacher "morale" 
reflect the desirable changes in rated 
principal behavior? 

Third, we should determine better 
whether the improvements in the post- 
ACT ratings reflect mere improved 
"halo effect" or actual changes in 
specific behaviors. One way to proceed 
on this question would be to collect 
post-ACT ratings on uncorrelated 
items not dealt with in the feedback; 
if behavior also improves on such 
items, the significance of the feedback 
on specific items must be questioned. 
Similarly, items differing widely in 
“changeability” could be compared 
as to the amount of change principals 
exhibit in them; presumably, if the 
ratings reflect more than general im- 
pressions, relatively unchangeable be- 
haviors should be rated as changing 
less. 

If research allays the skepticism 
implicit in these suggestions, further 
attention should be given to ways 
of enhancing the effectiveness of feed- 
back. The behavior of teachers, princi- 
pals, and persons in many similar 
roles could be made more effective by 
applying the results of such a pro- 
gram of research. 


EFFECT OF SIMILARITY IN STRUCTURE, MEANING, 
AND SOUND ON PAIRED-ASSOCIATE LEARNING 


LOWELL SCHOER 
University of Iowa 


The purpose of the study was to determine the effect of 3 types of 
stimulus similarity on paired-associate learning of 5th-grade Ss. 80 
Ss learned each of 4 lists (total N = 320). The stimuli in List I were 
similar in meaning, those in List II in sound, and those in List III in 
structure. List IV was the control list. List III performance was sig- 
nificantly lower than that on the other 3 lists. None of the other list 
differences were significant. There was also a significant ability effect, 
but the interaction between lists and ability level was not significant. 


The relationship between stimulus 
similarity and learning has been 
found to be consistent with predic- 
tions made from Gibson’s (1940) 
theory when forms and nonsense syl- 
lables have been used as stimuli 
(e.g., Feldman & Underwood, 1957; 
Gibson, 1942). The results have not 
been nearly so consistent when ad- 
jectives have been used (Beecroft, 
1956; Underwood, 1953). This might 
suggest that Gibson’s is an adequate 
theory of “form learning” but not of 
“word learning” (Underwood, 1961), 
or it might suggest that it is an ade- 
quate theory of word learning only 
when the dimension of similarity in- 
volved is that which might be asso- 
ciated with primary rather than sec- 
ondary generalization. The primary 
purpose of the present study was to 
investigate this latter possibility. 

Basic to such an investigation is 
some statement of the difference be- 
tween primary and secondary gener- 
alization. For purposes of the present 
study, primary or sensory stimulus 
generalization is defined as that gen- 
eralization which occurs as a function 
of the physical similarity among the 
stimuli. Secondary stimulus generali- 
zation is, on the other hand, defined 
as that generalization which occurs 
as a function of factors other than 
physical similarity. By these defini- 


tions the generalization that would 
occur as a result of using the words 
“age” and “ago” as stimuli would be 
primary generalization, while that 
which would occur as a result of using 
"large" and "big" as stimuli would 
be secondary generalization. In addi- 
tion to structure and meaning, words 
may also be similar in sound, for ex- 
ample, “ate” and “eight.” Such simi- 
larity, because it is a dimension of 
similarity that results from the 
sounds of the words when they are 
spoken rather than similarity in the 
structure of the stimuli, would seem 
to be more closely associated with 
secondary than with primary general- 
ization. 

Basically, the present study was 
designed to determine the effect of 
stimulus similarity in structure, 
meaning, and sound on paired-asso- 
ciate learning as a means of investi- 
gating the operation of primary and 
secondary generalization in word- 
learning. 

Because there is some suggestion 
from the literature that high- and 
low-ability learners differ in their 
susceptibility to the effects of gener- 
alization (Schoer, 1963), the study 
was also designed to investigate the 
effects of these three sources of gen- 
eralization on learners of varying 
abilities. 


METHOD 


Lists 

As indicated above, words may be simi- 
lar in meaning, sound, or structure and a 
pair of words may be similar in one of these 
dimensions and dissimilar in the other two. 
The first step in setting up the learning 
lists employed in the present study involved 
choosing three pairs of words similar in 
meaning but dissimilar in sound and struc- 
ture, three pairs of words similar in sound, 
but dissimilar in meaning and structure, 
and three pairs similar in structure, but dis- 
similar in meaning and sound. All the words 
used were chosen from among the first 2,000 
in the Thorndike-Lorge word list (Thorn- 
dike & Lorge, 1944). The six words of each 
type constituted the stimulus words for one 
learning list. The response words were the 
same for all three lists and consisted of six 
words dissimilar in meaning, sound, and 
structure. A fourth list, the control list, used 
the same six response words as the three 
lists described and six stimulus words which 
were also dissimilar in meaning, sound, and 
structure. The four learning lists are given 
below. 


        I                  II 
    (Meaning)           (Sound) 
    TALK-from           I-drop 
    SPEAK-kept          EYE-move 
    LARGE-move          WON-care 
    BIG-drop            ONE-poor 
    GLAD-care           ATE-from 
    HAPPY-poor          EIGHT-kept 

       III                 IV 
    (Structure)         (Control) 
    AGE-move            SPEND-move 
    AGO-drop            HAPPEN-poor 
    OWN-from            GOLD-care 
    ONE-kept            BIT-kept 
    ELSE-poor           TALL-from 
    EAST-care           LAUGH-drop 


Subjects 


The subjects (Ss) were students in 16 
fifth-grade classrooms in the Davenport, 
Iowa, public schools. Each of the four lists 
was learned in four classrooms. 

The Iowa Tests of Basic Skills composite 
was used as the measure of ability and the 
levels used were those set off by the quartiles 
on the Iowa norms. Because the statistical 
design employed required proportionality, 
a number of Ss who were in the classrooms 
where the lists were learned were not in- 
cluded in the analysis. The final number in 
the analysis was 320, with 20 Ss from each 
level on the ITBS for each of the four 
learning lists. 


Apparatus and Procedure 


The learning materials were presented to 
Ss in intact classes using a Dunning Ani- 
matic to project the lists onto a screen in 
the front of the room. All the classrooms 
were equipped with pull-down screens and 
the Dunning Animatic is completely porta- 
ble so a minimum of time was required to 
set up the needed equipment in the class- 
rooms. After the equipment was set up, all 
Ss were given a sheet of paper and asked to 
spell the six response words. Although this 
involved some response pretraining it was 
felt to be necessary because Ss were going 
to write their responses in the experiment, 
which necessitated finding out if there were 
any students who could not spell them. 
All such students, and there were very few 
of them, were dropped from the sample. 

After Ss had spelled the response words, 
the nature of the experiment and the use of 
the answer sheet was explained to them. 
They were told that they would be shown 
pairs of words and that they should study 
the pairs so that later when they were 
shown only the first word of the pair they 
could write down the second word. The an- 
swer sheet they were given provided space to 
write their responses. These answer sheets 
were set up in such a fashion that Ss could, 
after each response trial, quickly fold their 
answers on that trial under and out of sight. 
In addition, each S was provided with a 
mask to be used to cover the responses after 
he made them on a given trial. In this way 
all previous responses given by S were out 
of sight when he wrote the response to a 
given stimulus word. 

A practice list of three pairs of words was 
used to be sure Ss understood the directions 
and to give them practice in using the an- 
swer sheet. The fifth-grade Ss used had 
little difficulty understanding what they 
were to do and became relatively proficient 
in the use of their answer sheets during the 
period of practice provided by the practice 
list. 
Learning and recall trials were alternated 
during the learning of both the practice and 
the experimental lists. During the learning 
trials each pair was presented for 2 seconds. 
After all the pairs had been presented there 
was a 5-second delay interval after which 
only the stimulus words of each pair were 
presented. 
The S was given 5 seconds to write down 
the word that "went with" each stimulus 
word as it was shown. This recall trial was 
followed by a second learning trial which 
was in turn followed by a second recall trial, 
etc. The Ss were given four learning and 
four recall trials on the practice list and six 
learning and six recall trials on the experi- 
mental list. Each S learned only one list. 

RESULTS 

The criterion measure for the 
analysis was the number of correct 
responses on the sixth recall trial. 
The data were analyzed using a fac- 
torial design (Lindquist, 1953). 

The means for the four ability 
levels on the four lists are shown in 
Table 1. The F for lists was signifi- 
cant at well beyond the .01 level (F 
= 6.36, df = 3/304) as was the F for 
ability (F = 5.61, df = 3/304). The 
interaction between lists and ability 
was not significant (F = 1.15, df = 
9/304). 

Lindquist (1953, p. 214) describes 
a procedure for making individual 
comparisons of row or column means. 
By this procedure, a difference of .61 
was found to be necessary for signifi- 
cance at the .05 level. Because the N 
for rows is the same as the N for 
columns this is the critical difference 
for significance at the .05 level for 
both rows and columns. Applying the 
critical difference of .61 to the differences 
among the lists shows List III per- 
formance to be significantly lower 


TABLE 1 
LIST MEANS BY ABILITY LEVEL ON SIXTH 
RECALL TRIAL 

Level on Iowa                List 
Tests of Basic 
Skills              I  |  II | III |  IV | Total 
4                  4.7 | 4.9 | 3.2 | 5.0 | 4.5 
3                  4.8 | 4.5 | 3.3 | 3.6 | 4.1 
2                  3.9 | 3.2 | 3.8 | 4.3 | 3.7 
1                  3.4 | 3.8 | 2.4 | 3.4 | 3.3 

Total              4.2 | 4.1 | 3.1 | 4.1 

Note.—N = 20 per cell. 


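TABLE 2 
NUMBER OF SUBJECTS WHO GAVE RESPONSES 
THAT WERE INCORRECT FOR STIMULUS GIVEN 
BUT CORRECT FOR OTHER SIMILAR STIMULUS 
IN LIST ON TRIAL 6 

                        List 
Subjects        I  |  II | III |  IV 
Did            18  |  11 |  29 |   8 
Did not        62  |  69 |  51 |  72 

Note.—χ² = 19.94; df = 3; p < .01. Cell entries are taken from the counts reported in the text (N = 80 per list). 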


than the performance on any of the 
other lists. None of the other list dif- 
ferences even approaches significance. 
Applying the same critical differ- 
ence to the ability levels shows the 
performance of Ss in the upper quar- 
ter to be significantly higher than 
that of those in the second and bot- 
tom quarters, while that of Ss in the 
third quarter is significantly higher 
than that of Ss in the bottom quarter. 
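
The pairwise comparisons described above can be retraced from the rounded marginal means printed in Table 1. The following Python sketch is an illustration only and is not part of the original analysis. It assumes the marginal means as printed and the critical difference of .61 reported in the text; the function name `significant_pairs` is hypothetical.

```python
from itertools import combinations

CRITICAL_DIFFERENCE = 0.61  # .05 level, as reported in the text

# Marginal means as printed in Table 1 (rounded to one decimal place).
list_means = {"I": 4.2, "II": 4.1, "III": 3.1, "IV": 4.1}
ability_means = {4: 4.5, 3: 4.1, 2: 3.7, 1: 3.3}  # 4 = upper quarter

def significant_pairs(means, critical=CRITICAL_DIFFERENCE):
    """Return the label pairs whose means differ by more than `critical`."""
    return [
        (a, b)
        for a, b in combinations(means, 2)
        if abs(means[a] - means[b]) > critical
    ]

list_pairs = significant_pairs(list_means)
ability_pairs = significant_pairs(ability_means)
print(list_pairs)     # every pair returned involves List III
print(ability_pairs)  # quarter 4 vs. 2, 4 vs. 1, and 3 vs. 1
```

Because the inputs are rounded means, the sketch only reproduces the pattern of significant and nonsignificant differences, not the exact values used in the original computation.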

The list means for all six recall 
trials combined were 3.5, 3.3, 2.9, and 
3.4 for Lists I, II, III, and IV re- 
spectively. 

There is, of course, the possibility 
that these list differences are due to 
some factor other than similarity. 
Support for the similarity explana- 
tion could be provided by an analysis 
of the errors on Trial 6, particularly 
errors which involved S giving a re- 
sponse which was incorrect for the 
stimulus shown but correct for the 
other stimulus of the pair in the list. 
The number of Ss making this type of 
error on Trial 6 for each of the four 
lists is shown in Table 2. List IV did 
not involve pairs of stimuli which 
were similar; so pairs that were in the 
same relative position in the list as 
the similar pairs in Lists I, II, and 
III were used to derive the values for 
List IV. Twenty-nine of the 80 Ss 
who learned List III made at least 
one error on Trial 6 involving the 
response that was the correct response 
for the similar stimulus in the list. 
The numbers for Lists I, II, and IV 
were 18, 11, and 8 respectively. Anal- 
ysis of the data in Table 2 yielded a 
chi-square of 19.94 (p < .001), with 
a considerable proportion of that 
value coming from the two cells un- 
der List III. Although this does not 
prove that similarity was the only 
factor operating to produce the dif- 
ferences shown in Table 1, it does 
provide considerable support for 
similarity as a major contributing 
factor. It could, in fact, stand alone 
as support for the relatively more po- 
tent role of primary generalization 
as compared to secondary generaliza- 
tion in influencing the performance 
of Ss like those used in the study. 
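
The chi-square value can be checked from the counts given in the text (29, 18, 11, and 8 of 80 Ss per list). The sketch below is a plain-Python illustration of the usual Pearson chi-square for a 2 × 4 table; it is not the author's computation, and the helper name `chi_square` is hypothetical. The "did not" counts are derived here as 80 minus the reported counts, which is an assumption based on the stated N per list.

```python
# Ss (of 80 per list) making at least one confusion error on Trial 6,
# as reported in the text.
did = {"I": 18, "II": 11, "III": 29, "IV": 8}
n_per_list = 80
did_not = {k: n_per_list - v for k, v in did.items()}  # derived, not reported

def chi_square(rows):
    """Pearson chi-square for a contingency table given as equal-length rows."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

lists = ["I", "II", "III", "IV"]
table = [[did[k] for k in lists], [did_not[k] for k in lists]]
chi2 = chi_square(table)
df = (len(table) - 1) * (len(lists) - 1)
print(round(chi2, 2), df)
```

The statistic comes out at roughly 19.93 with 3 df, which agrees with the reported 19.94 up to rounding.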


Discussion 


The results rather strongly indicate 
that stimulus similarity does have a 
relatively greater negative effect on 
word learning when the dimension of 
similarity involved is that of struc- 
ture than when it is meaning or 
sound. This, in turn, suggests that 
the Gibson theory may be more ade- 
quate for word learning when pri- 
mary generalization is involved than 
when secondary generalization is in- 
volved. 

Gibson, herself, seems to have been 
more concerned with primary gener- 
alization than with secondary gener- 
alization. This, as indicated by Un- 
derwood (1961), is reflected in her 
choice of stimulus terms for her own 
studies (Gibson, 1941, 1942). Confu- 
sion, however, was created when the 
results of these studies were cited as 
support for the theory as a general 
theory of verbal learning without 
proper recognition that words may be 
similar along dimensions that may 
involve secondary rather than pri- 
mary generalization and that a theory 
adequate in dealing with one may 


not be adequate in dealing with the 
other. This, in fact, may be the case 
with the Gibson theory; it may be 
adequate in dealing with primary 
generalization but inadequate when 
secondary generalization is involved. 

In view of the as yet unsettled 
question of the effect of ability in 
paired-associate learning, it is inter- 
esting to note the significant main 
effect for ability and the linear re- 
lationship between ability level and 
learning that is evident in the last 
column of Table 1. 


SECOND-TRY RECALL, RECOGNITION, AND THE 


MEMORY-MONITORING PROCESS 


JOSEPH T. HART 
University of California, Irvine 


The experiment reported assesses how accurately the memory-moni- 
toring process operates as an indicator of memory storage. Ss were 
administered a general information questionnaire in 3 versions: 
a speeded recall test, a 2nd-try recall test, and a recognition test. 
During the first 2 tests Ss made memory-monitoring judgments 
about those items that they could not recall. The judgments indi- 
cated whether they felt they knew the answer (FK judgments) or did 
not know the answer (F̅K̅ judgments). A comparison of the propor- 
tions of correct answers, on the memory tests that followed, for the 
FK and F̅K̅ items showed a significantly greater number of correct 
answers for the FK items. The results are discussed (a) in terms of 
the significance of the memory-monitoring process for a fallible re- 


trieval system and (b) in relation to the concept of availability. 


The experiment to be reported here 
is the seventh in a series of experi- 
ments that have been conducted 
on the memory-monitoring process 
(Hart, 1965a, 1965b). Memory-mon- 
itoring is a term used to describe the 
intervening cognitive process that be- 
gins whenever people try to remember 
something and fail. When they fail 
they must then decide whether or not 
the sought-after item is really in their 
memories. It is this process, which 
people use to make judgments about 
what they have in their heads at times 
when they cannot get an item out of 
their heads and into a response, that 
is called the memory-monitoring (or 
MEMO) process. 

Subjectively, the MEMO process is 
well-known to everyone in the famil- 
iar feeling-of-knowing or tip-of-the- 
tongue experiences that we have every 
day. When someone asks the name of 
an acquaintance, we may try very 
hard to remember and still fail in our 
efforts, yet remain convinced that we 
know the sought-after name. The 
feeling may be so strong that we per- 
sist in our efforts to remember far 
beyond the limits of practicality. 

An important question to ask about 
these familiar and ubiquitous feeling- 


of-knowing experiences is, Are they 
accurate? This question is important 
because the MEMO process would be 
of value only if it could serve as a 
relatively accurate indicator of the 
presence or absence of items in the 
memory storage system. An accurate 
MEMO process could contribute to 
the efficiency of a memory system by 
signaling when continued efforts to 
retrieve an item would be useful and 
when they would be useless. In con- 
trast, an inaccurate MEMO process 
would be detrimental to the overall 
efficiency of a memory system since 
such a system would often attempt 
to retrieve items that were not in stor- 
age and not attempt to retrieve items 
that were in storage. 

Recently a method has been devel- 
oped to assess the accuracy of the 
MEMO process by applying a sim- 
ple recall-judgment-recognition para- 
digm. 

This paradigm makes use of the 
fact that recognition memory usually 
exceeds recall. Subjects (Ss) are first 
asked to attempt to recall some mem- 
ory items; for those items they cannot 
answer they are required to give a 
memory-monitoring judgment about 
whether or not they feel they know 




the correct answer well enough to 
recognize it among several wrong al- 
ternatives. After completing the re- 
call test, Ss are given a multiple- 
choice recognition test covering the 
same items that appeared on the test 
of recall. 

If the memory monitor is an ac- 
curate indicator of memory storage, 
Ss should do better on those recog- 
nition items which they felt they 
knew but could not recall than on 
those items they felt they did not 
know. Accuracy can be easily assessed 
by comparing the proportion of 
correct recognitions on FK (feeling- 
of-knowing) items with the propor- 
tion correct on F̅K̅ (feeling-of-not- 
knowing) items for each S. 
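
The accuracy comparison just described amounts to two proportions per S. The Python sketch below shows one way such proportions could be tallied; it is an illustration under stated assumptions, not the scoring procedure used in the study. The function name `monitoring_accuracy`, the labels "FK" and "FNK" (feeling of knowing and feeling of not knowing), and the example records are all hypothetical, and the records are made up rather than taken from the data.

```python
def monitoring_accuracy(items):
    """Proportion of later-correct items for each judgment type.

    `items` is an iterable of (judgment, recognized) pairs for one S's
    unrecalled items; judgment is "FK" or "FNK" (hypothetical labels).
    """
    counts = {"FK": [0, 0], "FNK": [0, 0]}  # [correct, total]
    for judgment, recognized in items:
        counts[judgment][1] += 1
        if recognized:
            counts[judgment][0] += 1
    return {
        label: (correct / total if total else None)
        for label, (correct, total) in counts.items()
    }

# Made-up records for a single hypothetical S (not data from the study).
example = [
    ("FK", True), ("FK", True), ("FK", False), ("FK", True),
    ("FNK", False), ("FNK", True), ("FNK", False), ("FNK", False),
]
result = monitoring_accuracy(example)
print(result)
```

An S whose feeling-of-knowing proportion exceeds the feeling-of-not-knowing proportion, as in this toy case, would count as showing accurate monitoring in the sense used here.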

The general finding, so far, has been 
that the memory-monitoring process 
does serve as a relatively accurate in- 
dicator of memory storage. Across 
seven experiments, the proportion of 
correct recognitions for the FK items 
has averaged .61, while the F̅K̅ aver- 
age has been .41. This result has held 
for both the recall and recognition of 
general information questions (such 
as, “Which planet is the largest in our 
solar system?”) and for the recall and 
recognition of recently learned paired 
associates (such as, “WHITE-CXJ”). 

The experiment reported in this 
paper extends the investigation of 
MEMO accuracy, by seeing if Ss can 
predict their successes and failures 
when they are given a second chance 
to recall items that previously eluded 
them. The recall-judgment-recogni- 
tion paradigm is altered to insert a 
second chance to recall and a second 
chance to make MEMO judgments 
before the recognition test. 


METHOD 


Subjects 


The eighteen Ss in the experiment were 
undergraduates at the University of Cali- 


fornia, Irvine; all were enrolled in a course 
on Personality and Cognition. 


Materials 


Seventy-five general information ques- 
tions were used as the test materials; these 
questions had been used in previous expe- 
riments (Hart, 1965a, 1965b). “What is the 
capital city of New Mexico?" is a question 
from the test. On the recognition test this 
question appeared with the alternatives: “a. 
Albuquerque b. Santa Fe c. Los Alamos d. 
Carlsbad." All of the recognition test ques- 
tions gave four alternatives. 


Procedure 


The procedure was divided into three 
parts. In Part 1 Ss were administered a 
speeded version of the test of recall. They 
were given only 3 seconds to answer each 
question as it was read to them. The in- 
structions read, “Work quickly, provide an 
answer to every question even if you must 
guess or give an association, circle the an- 
swers you believe might be correct and 
make feeling-of-knowing judgments (FK or 
F̅K̅) for the uncircled items.” It is important 
to note that Ss were required to provide 
some kind of answer to every question, even 
if they felt that they had no idea about 
what the correct answer might be. This in- 
struction was necessary because if Ss are 
allowed to leave answers blank they some- 
times withhold answers about which they 
are not certain and assign an FK judgment 
to those items. Overcautious withholding of 
answers can result in an inflation of the FK 
accuracy score. 

In Part 2 Ss followed the same proce- 
dures, except that they were given a printed 
form of the questions and allowed to pace 
themselves. Also, in Part 2 they were asked 
to consider, when making their MEMO 
judgments, whether they would be able to 
recognize the correct answers when they saw 
them. For Part 1 the criterion question Ss 
asked themselves when making MEMO 
judgments was, “Even though I can’t re- 
member the answer at the moment, could 
I remember if given more time?” For Part 
2 the criterion question was, “Even though 
I can’t produce an answer now, do I know it 
to the extent that I could pick the correct 
answer from among several wrong answers?” 

In Part 3 of the experiment, Ss were 
given 10 minutes to complete the multiple- 
choice recognition test on the 75 items. The 
Ss were instructed to answer all the ques- 
tions, guessing when necessary. 


RESULTS 


MEMO accuracy is evaluated by 
seeing how well Ss can predict which 
additional items they will recall when 
given a second try and which addi- 
tional items they will recognize. To 
do this it is necessary to show that 
most Ss do improve their scores on 
the second-try recall and recognition 
tests. Otherwise there would be no 
memory additions for the MEMO 
process to predict. 

In their first attempts to answer 
the 75-item questionnaire, Ss aver- 
aged 24.1 correct answers (SD = 
8.2). In their second try at recall Ss 
averaged 27.6 correct answers (SD = 
7.9), a small but significant improve- 
ment. All but one of the Ss improved 
on the second try. (One S actually 
missed one more item when he was 
given a second chance!) On the re- 
cognition test, however, even that er- 
rant S showed improvement; the 
average was 46.1 items correct (SD = 
5.9, with a range of 11-28 additional 
items correct in moving from the sec- 
ond-try test of recall to the recogni- 
tion test). 

To evaluate MEMO accuracy we 
examine only those items that the Ss 
missed and believed that they missed 
on the preceding memory test. Those 
are the items for which Ss will have 
made FK and F̅K̅ judgments. We then 
look at how well those judgments pre- 
dicted the actual memory perform- 
ances for the memory test that fol- 
lowed. 

If the MEMO process is function- 
ing with at least a minimal level of 
accuracy then many more of the cor- 
rect additions should have been 
judged FKs than F̅K̅s. Or, in other 
words, people should eventually re- 
member more items that they feel 
themselves to know (but cannot re- 
call) than items they feel they don’t 
know. The proportion of FK items 
subsequently gotten correct can be 
compared to the proportion of correct 
F̅K̅ items to evaluate MEMO accu- 
racy. 

In moving from the first test of re- 
call to the second test, the mean pro- 
portion of FK items correct was .25 
(SD = .25); for F̅K̅ items the pro- 
portion correct was .05 (SD = .03). 
A t test of these means (for correla- 
ted scores) yields a value of 3.8, 
which is significant at the .01 level 
for 17 df. There were no reversals; 
for all Ss the FK proportion exceeded 
the F̅K̅ proportion. 

In moving from the second-try test 
of recall to the recognition test the 
mean proportion of FK items correct 
was .59 (SD = .14); for F̅K̅ items the 
proportion correct was .30 (SD = 
.09). A t test for these means gives a 
t value of 6.6, which is significant at 
the .001 level for 17 df. All the Ss 
showed more FK recognitions than 
F̅K̅ recognitions; the reported means 
indicate that nearly twice as many 
FK items were recognized as F̅K̅ 
items. The F̅K̅ proportion, .30, does 
not differ significantly from the .25 
value to be expected if the Ss merely 
guessed on the four-choice recognition 
items. 
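
The arithmetic behind "nearly twice as many" and the chance baseline can be written out directly. The short Python sketch below uses only the means reported above and is an illustration, not a significance test; the variable names are hypothetical.

```python
n_alternatives = 4            # each recognition item offered four choices
chance = 1 / n_alternatives   # expected proportion correct under pure guessing

fk_correct = 0.59      # reported mean proportion correct, feeling-of-knowing items
not_fk_correct = 0.30  # reported mean, feeling-of-not-knowing items

ratio = fk_correct / not_fk_correct          # how many times more often FK items were right
excess_over_chance = not_fk_correct - chance  # margin above guessing for the other items
print(chance, round(ratio, 2), round(excess_over_chance, 2))
```

The ratio is a little under 2, and the feeling-of-not-knowing items sit only about .05 above the guessing level of .25.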

It appears, from these comparisons, 
that Ss can tell what they have in 
their memory systems even before 
they can produce the memories. 
Clearly, however, the MEMO process 
is not perfectly accurate. If it were, 
the FK proportion would be 1.00 and 
the F̅K̅ proportion, on the second test 
of recall, would be .00. Instead Ss 
were able to recall only .25 of the 
items that they believed they might 
produce if given a second chance. 
What happened was that many Ss 
overestimated the number of items 
they would be able to remember if 
given a second chance. 

If we look at the MEMO judg- 
ments made on the first-try recall test 
and use them to predict recognition 
responses rather than second-try re- 
call then the FK mean rises to .62 
(SD = .18) and the F̅K̅ mean to .38 
(SD = .08). A t test for correlated 
scores yields a value of 4.85, which 
is significant at the .001 level for 17 
df. The monitor appears, from these 
results, to be a more accurate pre- 
dictor of recognition performances 
than recall performances. 

Since the directions to the Ss did 
not specify how much time they 
would be given on their second at- 
tempts it is possible that they were 
misled. Given more time, perhaps 
they would be able to retrieve more 
of the FK items. This possibility will 
be tested in further experiments. 


Discussion 


The results show that the MEMO 
process can be used as a relatively 
accurate indicator of memory stor- 
age. The indicator can be applied to 
the prediction of recognition successes 
and failures, and, somewhat less ac- 
curately, to the prediction of second 
attempts at recall. When people feel 
that they know something, it is very 
likely that they do know it, and when 
they feel that they do not, it is likely 
that they do not. 

An appreciation of the possible sig- 
nificance and usefulness of the 
MEMO process can be gained by 
comparing the human information- 
processing system with that of com- 
puters. Human systems are enor- 
mously flexible but they lack speed 
and, most importantly, they are fal- 
lible. It is this fact of memory falli- 
bility that makes the MEMO process 
important. If human beings were com- 
puters with nearly infallible memories 
they would not need a MEMO indi- 
cator. A memory item, if in storage, 


would be retrieved; failure to retrieve 
an item would simply mean that it 
was not in storage. 

Clearly, however, people do not 
function this way—retention often 
exceeds recall. For a fallible memory 
system, in which what is retained of- 
ten exceeds what is retrieved, the 
MEMO process can be exceedingly 
useful. It can operate as a storage 
indicator at times when retrieval ef- 
forts are unsuccessful. When the in- 
dicator signals that an item is not in 
storage, the system need not continue 
to expend useless effort and time at 
retrieval. Instead, input can be sought 
that will put the item into storage. 
Or, if the MEMO indicator signals 
that an item is in storage, the system 
can persist in attempts to retrieve in- 
formation, avoiding the redundant in- 
put of information that is already 
possessed. 

Apart from its practical usefulness 
to the human memory system, the 
MEMO process is also of interest be- 
cause of the theoretical questions it 
raises about the system. The demon- 
stration that people can rely upon 
subjective experiences to predict the 
contents of their memories is of gen- 
eral interest to psychologists con- 
cerned about the functional relation- 
ships between subjective processes 
and behavior processes. 

For example, recent studies of ver- 
bal learning have emphasized the dis- 
tinction between response learning 
and association learning (Underwood 
& Schultz, 1960). One interesting 
theoretical construct that has devel- 
oped from this distinction is the con- 
cept of availability. This concept as- 
serts that associations may be present 
but production of the associations can 
be impeded until the responses are 
made available. 

The concept has been experimen- 
tally applied to show that the usual 
asymmetry between forward and 
backward associations is removed if 
the backward associate is made as 
available as the forward. When un- 
available responses are made available 
hidden associations become visible in 
overt behavior (Horowitz, Norman & 
Day, 1966). 

These experiments on availability 
indicate that, “Even if a response is 
not available—not overt—it is ac- 
tivity in the nervous system [Horo- 
witz, 1965, p. 8].” The point here is 
that a psychology of verbal learning 
must certainly examine carefully any 
subjective process which allows peo- 
ple to know that an association is 
present at times when the response is 
unavailable. The memory-monitoring 
process does just that. 


THE MANY FACES 


OF INTELLIGENCE* 


CHARLES E. WERTS 
National Merit Scholarship Corporation 


Data on a sample of 127,125 college freshmen were used to study the 
relationship between high school grade average and various types 
of extracurricular talent displayed during high school. In the scientific, 
literary, leadership, speech, drama, music, and art areas, the percent- 
age of achievers was greater among students with high grades than 
among students with low grades in high school. Students with high 
grades usually won recognition in several of these extracurricular 
areas, whereas the majority of students with low grades did not show 
any extracurricular achievement. 


High school grades and aptitude- 
achievement test scores are the most 
widely used criteria for college ad- 
mission. In recent years, a variety of 
critics have argued that reliance on 
these criteria results in heavy loss of 
persons who have the capability of 
“creative” performance in “real life” 
(nonacademic) situations. These crit- 
ics (typified by Holland and Rich- 
ards, 1965) point to the low correla- 
tions of academic achievement tests 
with nonacademic accomplishments 
as evidence that selection by aca- 
demic performance necessarily will 
miss a large proportion of persons 
who have the capability of “creative” 
or “real life” performance. The va- 
lidity of this argument will be exam- 
ined here for those cases in which 
correlations are computed on samples 
of college students. 


METHOD 
Subjects 


The subjects were 127,125 students enter- 
ing 248 4-year colleges and universities in 
the fall of 1961. With few exceptions, these 
students included the entire freshman class 
at each institution. The colleges were cho- 
sen to include a wide variety of types of 
institutions in all regions of the United 
States. Details of the college selection pro- 
cedure were given by Astin (1965). 

¹ This study is part of the National Merit 
Scholarship Corporation research program, 
which is supported by grants from the Na- 
tional Science Foundation, the Ford Foun- 
dation, and the Carnegie Corporation of 
New York. This paper is an extension of an 
earlier discussion by Astin (1962) using the 
same data. 


Measures 


Along with the usual registration forms, 
each student filled out a short information 
form which included the following ques- 
tions: 

1. Circle one: Male Female 

2. Your high school average (circle one): 

D C C+ B- B B+ A- A A+ 

3. Indicate whether you have achieved 
any of the following by underlining the 
appropriate words. On the line before 
any item you underline, indicate the 
number of times you have achieved it. 

First, second, or third place in: ...... 
school science contest; ...... regional or 
state science contest; ...... national sci- 
ence contest 

...... leads in high school or church spon- 
sored plays; ...... first, second, or third 
in regional or state speech or debate con- 
test; ...... first, second, or third in na- 
tional speech or debate contest. 

...... elected to one or more student of- 
fices; ...... elected president of my class; 
...... received award or special recogni- 
tion for leadership of any kind 

...... participated in national music con- 
test; received a rating of “good” or “ex- 
cellent” in: ...... state music contest; 

...... won a prize or award in art compe- 
tition (sculpture, ceramics, painting, etc.); 
exhibited or performed a work of art 
(painting, musical composition, sculpture) 
at: ...... my school; ...... place other 
than my school 

...... edited school paper or literary 
magazine; ...... had poems, short stories, 
or articles published in public newspaper 
or magazine (not school paper) or in 
state or national high school anthology; 
...... won literary award or prize for 
creative writing 

Davidsen (1963) found that self-reported 
grades are reasonably accurate and corre- 
late .92 with school-reported grades. Data 
on a subsample of 20,000 students who par- 
ticipated in the annual National Merit 
talent search indicate that high school grade 
average as reported by the students in this 
sample is linearly related to National Merit 
Scholarship Qualifying Test (Science Re- 
search Associates, 1966) scores, the correla- 
tion among college freshmen being about 
.50. 


Design 


Each person's responses to the 18 talent 
items were categorized into a some versus 
none dichotomy. The number of persons 
with a given high school grade average 
(HSG) that checked each item was com- 
puted and divided by the total number of 
persons with that grade average to obtain 
the percentage of achievers for each HSG 
category. Analyses were done separately for 
males and females because their talent pat- 
terns are dissimilar, and because among en- 
tering freshmen, girls have higher grades 
(M = 5.5) than boys (M = 4.7). 
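
The tabulation described above can be sketched in a few lines. This is only an illustration of the some-versus-none dichotomy and the per-category percentage; the function name `percent_achievers` and the sample records are hypothetical and are not data from the study.

```python
from collections import defaultdict


def percent_achievers(records):
    """Percentage of achievers per grade category for one talent item.

    records: iterable of (hsg_level, times_achieved) pairs, one per student.
    A student counts as an achiever if times_achieved is greater than zero
    (the some-versus-none dichotomy).
    """
    totals = defaultdict(int)
    achievers = defaultdict(int)
    for level, times in records:
        totals[level] += 1
        if times > 0:
            achievers[level] += 1
    return {level: 100.0 * achievers[level] / totals[level] for level in totals}


# Hypothetical responses for a single item (not data from the study).
sample = [("A", 2), ("A", 0), ("A", 1), ("A", 0),
          ("C", 0), ("C", 0), ("C", 1), ("C", 0)]
rates = percent_achievers(sample)
print(rates)
```

In the study itself this computation would be run separately for males and females and for each of the 18 items.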


Methodological Considerations 


The percentages shown in Tables 1 and 
2 must be interpreted in light of the follow- 
ing considerations: 

1. Because the sample was composed of 
entering college freshmen, McNemar's (1964) 
criticism of other studies for restriction in 
range on ability and criteria applies to this 
study too. For instance, the validity of 
comparing A+ students, most of whom go 
to college, with C students depends mainly 
on how close the percentage of achievers 
(nonacademic) among C students who go 
to college is to the percentage of achievers 
among representative C students in high 
school. C students (or perhaps any stu- 
dents) who have these extracurricular 
achievements may be more likely to go to 


college than those who do not. If so, the 
results will err on the conservative side. The 
percentages for drama, art, and music may 
be even more questionable, because a large 
number of the most talented students in 
these areas go on to specialized schools— 
like Juilliard School of Music—rather than 
to 4-year colleges. A sample of college stu- 
dents really should not be used to study 
talent loss resulting from use of academic 
criteria for college admission, since every- 
one in the sample has survived the selection 
process. Talent loss from this source might 
better be studied by comparing those who 
were selected against those who were not. 

2. The use of high school grade average 
as an academic performance measure is 
crude, because grades reflect nonintellectual 
factors like the student’s relationship with 
the teacher; and grades from different high 
schools and courses frequently are non- 
comparable due to variations in quality of 
the student body and in grading practices. 
Since these factors probably are sources of 
unreliability, the results again are likely to 
be conservative. A better design would in- 
clude both grades and test scores as aca- 
demic measures. 

3. Even if the above limitations were 
not present in this study, it would be diffi- 
cult to interpret changes in percentages of 
achievers with changes in HSG level as 
meaning there is or is not a necessary con- 
nection between these achievements and 
HSG. The problem can be illustrated by 
the finding that 17.6% of the A+ males, 
compared with 1.1% of the C males, checked 
the regional science item. What happened 
to the 82.4% of the A+ and the 98.9% of 
the C students who did not win? Perhaps 
all A+ but only a small portion of C stu- 
dents who participated won prizes—or per- 
haps all C students won (assuming only 
this 1.1% of C students participated), and 
only 17.6% of A+ students (assuming all 
A+ students participated). To assume that 
the ratio of 17.6% to 1.1% represents the 
true association between HSG and winning 
the regional science contest is to assume 
automatically that the percentage of all A+ 
students competing is the same as the per- 
centage of all C students who competed. 
If almost all A+ students and only a small 
percentage of C students competing in the 
regional science contest won, the ratio of 
17.6% to 1.1% might be a radical under- 
estimate of the true association between 
HSG and winning the contest. Similarly, 
if the reverse were true, the 17.6% to 1.1% 
ratio might be a gross overestimate of the 
true association. Even knowledge of which 
persons were formally known to be entering 
the contest would not solve the problem 
of interpretation, since it would not be 
known how HSG affects which persons 
formally enter the contest. Thus, almost re- 
gardless of the percentage differences be- 
tween HSG groups, little can be said about 
the true association between HSG and win- 
ning regional science contests. A more ap- 
propriate manner of studying how intelli- 
gence (or HSG) is related to talented per- 
formance has been suggested by McNemar 
(1964). In his design, IQ scores are plotted 
against a criterion measure, such as literary 
creativity. The crucial point is that Mc- 
Nemar conceives of creativity as a variate 
ranging from high to low that should be 
measured for persons in a certain line of 
endeavor. In other words, writing creativity 
should be measured for writers, scientific 
creativity for scientists, etc. Thus, studying 
the percentage of state science winners at 
various HSG levels errs in comparing state 
science winners with an inappropriate ref- 
erence group—students in general. 
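
The participation problem in point 3 can be made concrete with a small arithmetic sketch. The two scenarios below are the extreme cases described in the text; the function name `percent_checking` and the assumed participation and win rates are illustrative only, since the study did not record who entered the contests.

```python
def percent_checking(participation_rate, win_rate_among_participants):
    """Percentage of a grade group that checks an item, given the share of
    the group that competes and the share of competitors who win."""
    return 100.0 * participation_rate * win_rate_among_participants


# Scenario 1: every A+ student competes and 17.6% win;
# only 1.1% of C students compete, and all of them win.
scenario_1 = (percent_checking(1.0, 0.176), percent_checking(0.011, 1.0))

# Scenario 2: only 17.6% of A+ students compete, and all of them win;
# every C student competes and 1.1% win.
scenario_2 = (percent_checking(0.176, 1.0), percent_checking(1.0, 0.011))

# Both scenarios produce the same observed percentages (17.6 and 1.1),
# although the win rates among competitors point in opposite directions.
print(scenario_1, scenario_2)
```

Because the observed percentages are identical in both cases, the table alone cannot distinguish them.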

A related problem lies in interpreting 
the association between a specific accom- 
plishment and academic performance meas- 
ures—even if the data met the assumptions 
required for the appropriate coefficient, and 
the coefficient of association had been com- 
puted on an unbiased sample with com- 
parable academic performance measures 
and accurate talent data. The following ex- 
ample illustrates one difficulty. Suppose 
there were 100 high-HSG persons each of 
whom displayed different talents, and that 
this group were compared to 100 low-HSG, 
talentless persons. Since only high-HSG 
persons would have shown any talent at 
all, one probably would conclude that HSG 
is closely related to talented performance. 
However, if a phi coefficient were computed 
on the whole sample between a specific 
accomplishment (yes or no) and HSG (high 
or low), a low coefficient of .07 would 
result, since for each accomplishment 99 
of the high-HSG persons would not have 
displayed that accomplishment. Thus, a list 
of 100 small phi coefficients, one for each 
accomplishment, could be obtained. Because 
talent correlates perfectly with HSG in 
this example, the small size of each phi 
coefficient is only a reflection of the small 
proportion of high-HSG persons who had 
that particular talent. The size of the phi 
coefficient tells nothing about the true 
association between HSG and that particu- 
lar talent. The important point is that in 


actuality the correlation of HSG with a 
given accomplishment will be influenced 
greatly by the proportion of the whole 
population that engages in such a line of 
endeavor. It follows that a low correlation 
between HSG and a given accomplishment 
(computed on a broad sample) cannot be 
validly interpreted to mean that HSG is 
not related to that particular accomplish- 
ment, nor can a list of such low correlations 
for different talents be offered as evidence 
that HSG is not, in general, associated with 
talented performance. The same interpre- 
tive problem exists for studies that corre- 
late (on a broad sample) aptitude or in- 
telligence test scores with various types of 
accomplishments. One can substitute the 
word “aptitude” or "intelligence" for “aca- 
demic performance” in this methodological 
discussion. 
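
The phi coefficient in the 100-versus-100 example can be checked directly. The sketch below uses the standard fourfold-table formula for phi; the function name `phi` and the cell layout are choices made here for illustration.

```python
from math import sqrt


def phi(a, b, c, d):
    """Phi coefficient for a fourfold table.

    a: high HSG with the accomplishment    b: high HSG without it
    c: low HSG with the accomplishment     d: low HSG without it
    """
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


# One specific accomplishment: 1 of the 100 high-HSG persons displays it,
# none of the 100 low-HSG persons does.
phi_specific = phi(1, 99, 0, 100)

# Any talent at all: all 100 high-HSG persons, none of the low-HSG persons.
phi_any_talent = phi(100, 0, 0, 100)

print(round(phi_specific, 2), phi_any_talent)
```

Each specific accomplishment yields a phi of about .07, while the dichotomy "any talent versus none" yields a phi of 1.0 on the same 200 persons.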

The above arguments indicate that the 
correlations of specific achievements with 
HSG are not meaningful in themselves. 
The example in which 100 low phi coeffi- 
cients were obtained suggests one interpre- 
tation: Given the list of phi coefficients, 
one can deduce that talent in general and 
HSG are perfectly correlated by using a 
multiple correlation model. Theoretically, 
the multiple correlation model implies that 
the skills inherent in high HSG can be uti- 
lized in a variety of extracurricular achieve- 
ments. The correlation of one type of 
achievement with HSG cannot in itself be 
interpreted, precisely because some persons 
will display their academic skills in another 
type of achievement. 


RESULTS 


The percentage of achievers (per- 
sons checking an item) at each HSG 
level is shown in Table 1 (males) and 
Table 2 (females). The percentage of 
achievers rose exponentially with rise 
in HSG for the three science and the 
three literary items. These percent- 
ages plotted as a straight line on 
semilog paper. Curves for other items 
showed a generally rising percentage 
of achievement with rise in HSG. 

A different perspective was gained 
by looking at the number of areas in 
which a person achieved. Using the 
item grouping indicated in Table 1, 
there were seven possible areas of 
achievement: Science, Drama, Speech 
and Debate, Leadership, Music, Art, 
and Literary. The percentage of per- 
sons at each HSG level that checked 
none of the achievement items may 
be compared with the percentage that 
checked items in more than one area 
of achievement. The percentage check- 
ing no item decreased systemati- 
cally from 64.5% at the D level to 
7.6% at the A+ level for males and 
from 47.2% to 5.6% for females, 
whereas only 12.8% of the D versus 
70.4% of the A+ males (15.7% versus 
72.6% for females) achieved in more 
than one area. If these nonacademic 
achievements were to be used for se- 
lection purposes, the major effect 
would likely be the elimination of the 
majority of the low grade students.² 
The achievement items used here 
were similar to those used by Holland 
and Richards (1965), in which point- 
biserial correlations were given for 


² In a personal communication (June 6, 
1966) McNemar suggested that these data 
indicate “that degree of versatility in ac- 
complishments is a function of intellectual 
level.” 

TABLE 1 
PERCENTAGE OF MALES CHECKING EACH ACHIEVEMENT AT VARIOUS HIGH 
SCHOOL GRADE LEVELS 

Achievement area and item no. | Percentage checking item (base rate) | High school grade average: D | C | C+ | B- | B | B+ | A- | A | A+ | Correlations: rpbis | rHR 
Science 
1, School science award 9.4 | 3.1] 4.2] 5.7] 7.1] 8.5 | 12.1] 15.6 | 20.1) 31.6) .16 
2. Regional science 
award 3.5 8| 1.1] 1.9] 2.7] 3.1] 4.2] 6.5) 8.6] 17.6] .12 B 
3. National science 
award 3 e| aj o a| 2| 2] .4| -5| 7| 18] 08 
Drama 
4. Lead in school play 22.3 | 10.2 | 14.4 | 17.6 | 21.3 | 23.3 | 26.2 | 29.1 | 31.8 | 32.4 | .12 
Speech and debate 
5. Regional speech 
award 46 | 12| 2.0| 2[7| 3.5] 41| 5.6] 84/10.) 11.2) 11 .08 
6. National speech 
award 2 8) 2] aj [3] 4] 3| 4| 6] 02 
Leadership 
7. Elected to student 
office 37.6 |14.9|23.2 | 28.1 | 33.4 | 39.1 | 45.0 | 53.0 | 58.2 | 60.9 | .22 
8. Elected class president) 13.1 | 3.9| 6.5| 8.4 | 11.0 | 13.4 | 17.2 | 20.6 | 22.0 | 20.5] .18 E 
yz Lendernhip award 27.5 | 11.0 | 14.2 | 17.6 | 22.9 | 27.0 | 35.6 | 42.8 | 49.6 | 56.8 | .24 
usio 
10. National music par- 
ticipant 1.5 .8| 1.3] 1.5] 1.7] 1.5] 1.5] L6| L6| 2.8) 0t 
11. State music award &3 | 41] 5.6] 68| 7.8| 8.1] 9.5] 11.4] 12.1 11.0) .07 .02 
AM: National music award .6 3i s| 4| 5| 6] 7| 7| 9| 15] 08 
13. Art prize 34 | 38| 3.0| 3.1] 3.5| 3.5] $.7| 3.1) 3.5] 6.6) OL 
14. School art exhibit ta | 36| 39| 38| 45| 4.6] 4:3) 4.7] 6.2) 7.5] .06 | —.05 
15, Other art exhibit 3:6 | 2:6] 2.8] 3.0| 3:6| 3.6) 3.9) 4.5) 4.6) 6.1) .03 
ra 
16. ‘Editor of school pub- 
6 plioation, dr 3.2 | 2.0] 3.5| 4.6| 6.5| 7.5|10.6 | 15.8 | 18.5 | 21.4 | .15 
; D 
5 cathe other than} | | 2.0] 3.1| 4.0] 49| 5.9] 8.4|107 15.2 18.9) .12 10 
. Creati iti 
` NN 3.2 8| 10| 14| 2.0] 28| 40| 63| 85|127| .12 
uml 
ape 5 | 1.4 | 44.1 | 36.5 | 31.4 | 23.0 | 16.2 | 11.8 | 7.6 
More than one 12:8 | 18,0 | 23.4 | 29.0 | 34.0 | 42.7 | 50.8 | 60.3 | 70.4 
Note.—For each achievement item, the percentage of persons checking the item at each of the nine high school 
grade levels is shown. The actual number of males at each high school grade level, going from left to right, is: 609, 
9,479, 12,711, 12,102, 14,468, 11,988, 7,377, 4,830, and 655. 
* The base rates were calculated on the 76,015 males, of whom 1,706 did not respond to the HSG question. For this 
reason the base rate is not a weighted average of the percentages for the nine HSG levels. 
The point-biserial correlations (rpbis) of each achievement item with high school grade average were computed 
in order to make comparisons with equivalent correlations from Holland and Richards’ (1965) Table 2. It can be seen 
that the point-biserial correlations within each of the six achievement areas are close to the correlations (rHR) reported 
by Holland and Richards for that area. 

TABLE 2 
PERCENTAGE OF FEMALES CHECKING EACH ACHIEVEMENT AT VARIOUS HIGH 
SCHOOL GRADE LEVELS 

Achievement area and item no. | Percentage checking item (base rate) | High school grade average: D | C | C+ | B- | B | B+ | A- | A | A+ | Correlations: rpbis | rHR 
Pe box Blanes AR 6.9 |11| 3.2] 3.6] 45| 5.5] 7.0| 9.2] 12.4] 14.6) .12 
hae e a 2.7 ,0|..8| 1.2] 1.6] 1.9] 2.7 | 3.9] 53] 6.7] .09 10 
ren T OH RA Rp ade dy, Bf ena} .s| .s| .e 
4. Lead in School Play | 27.3 | 7.9 | 17.4] 20.8 | 23.8 | 25.0 | 29.5 | 31.4 | 34.6 | 32.7 | 11 
sero 
Ava i 58 | 23] 2.6] 3.6] 4.0] 46] 6.3] 7.8] 9.2] 11.3] .09 10 
6. National Speech 
Award 2 [oda 2 3| .8| .02 
7. Elected to Student 
Ofog 48.9 | 16.9 31.5 | 40.1 | 44.9 | 53.0 | 50.8 | 64.7 | 72.4 | .25 
8. Elected Claas Presi- 
dent $3 | 3.4] 3.2] 3.6) 4.4] 5.2] 6.5] 8.5|10.2 | 10.4] .09 E 
9. Leadership Award 35.7 | 11.2 | 15.0 | 17.4 | 23.4 | 98.7 | 39:9 | 49.8 | 55.6 | 62.0 | .30 
10. National Music Par- 
ticipant 2.9 | 1.1] 2.9] 2.8] 2.9] 3.0] 2.8] 3.0| 3.0] 3.1] .00 
11. State Music Award 12.8 | 11.2] 7.9] 9.8 | 10.2 | 12.0 | 13.8 | 15.6 | 15.7 | 14:7 | 107 .06 
12. Ni Musio 
18 2| 1.0) 1.1] 1.0] 1.2] 1.3] 14| 1.7] 19| .02 
13. Art Prize 7.0 |79| $2| 66) 7.5] 7.1] 7.1] 71| 783| 9.6] .0 
14. School Art Exhibit 9.6 | 10.1] 8.2] 8.8] 10.0] 9:7] 9.7| 97| 9.8| 10:5] .01 | —.05 
15. Other Art Exhibit 82 | 4.5] &7| 68] 7.8) 8:3] 8.4] 91| 92| 9.9] <04 
iia toni Teall) sos 3.4| 5.4 | 7.5 | 10.2 | 12.6 | 17.9 | 22.9 | 27.7 | 31.8 | .02 
Mtem 1.3 | 3.4] 45| 6.0| &1 $ 18.3 | 26.4 | .15 16 
H 6 .1| 8.9 | 12.0 | 15.3 | 18.3 | 26.4] . n 
ploy vci Aro $7 |23| 1.8] 2. Jd 18.3 | .16 
i -3 | 1.8 | +2.3 | 3.8] -4.3 | 6.8 | 10.8 | 13.1 | 18.3 | . 
Number of areas checked 
None 47.2 | 44.1 | 36.7 | 29.1 | 24.7 | 16.8 | 11.6 | 8.4 | 5.6 
More than one 15.7 | 23.5 | 30:0 | 37.4 | 42.4 | 62,4 | 60.8 | 66.8 | 72.6 
Note.—For each achievement item, the percentage of persons checking the item at each of the nine high school 
grade levels is shown. The actual number of females at each high school grade level, going from left to right, is: 89, 
... 4,254, 5,950, 11,202, 11,459, ... 678, and ... 
The base rates were calculated on the ... females, of whom 1,198 did not respond to the HSG question. For this 
reason the base rate is not exactly the weighted average of the percentages for the nine HSG levels. 
The point-biserial correlations (rpbis) of each achievement item with high school grade average were computed in 
order to make comparisons with equivalent correlations from Holland and Richards’ (1965) Table 2. It can be seen 
that the point-biserial correlations within each of the six achievement areas are close to the correlations (rHR) reported 
by Holland and Richards for that area. 

each of six areas of achievement: 
Science, Leadership, Dramatic Arts, 
Artistic, Literary, and Musical. Items 
1, 2, and 3 in the present study fell 
into Holland and Richards’ Science 
area; Items 4, 5, and 6 into their Dra- 
matic Arts area; Items 7, 8, and 9 
into Leadership; Items 10, 11, and 12 
into Musical; 13, 14, and 15 into Ar- 
tistic; and 16, 17, and 18 into Liter- 
ary. Tables 1 and 2 show that for 
each area Holland and Richards ob- 
tained correlations of HSG with 
achievement that came close to the 


three point-biserial correlations for 
corresponding items in the present 
study. This finding is not surprising, 
since both studies used survey data 
on large samples of college students. 
The point-biserial correlations for 
the different items are in part non- 
comparable, because each correlation 
is not free to vary within the same 
range. The maximum obtainable cor- 
relation for each achievement item 
was computed from the original cross 
tabulations by maintaining the mar- 
ginal row and column frequencies and 
arbitrarily assigning to the achievers 
the highest possible grades, so that no 
nonachiever was assigned a higher 
HSG level than an achiever. Thus, 
it was possible to calculate the max- 
imum point-biserial correlations for 
Table 1, going from Item 1 to Item 
18: .57, .38, .12, .73, .42, .11, .81, .63, 
.77, .21, .55, .18, .37, .41, .38, .55, .50, 
and .36. Each point-biserial correla- 
tion was divided by its maximum 
value, as is often done for the phi 
coefficient. The point-biserial correla- 
tions for the scientific, literary, and 
leadership items were about 25% to 
35% of their maximum, whereas the 
artistic and musical item correlations 
averaged less than 10%, with dra- 
matic arts items somewhere in be- 
tween. Although it is tempting to sug- 
gest that the ratio of the biserial 
correlation to its maximum is a fairly 
good measure of the degree to which 
academic skills are important for a 
particular achievement, this interpre- 
tation ignores the problems raised 
earlier. 
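
The maximum point-biserial correlation described above can be sketched as follows. This is an illustration under stated assumptions, not the author's actual program: grade levels are assumed to be coded 1 (D) through 9 (A+), the male C-level count is taken as 9,479 where the printed table note is hard to read, and the number of male achievers on Item 1 (about 7,010) is estimated here by summing count times percentage over the grade levels of Table 1. The function name `max_point_biserial` is hypothetical.

```python
from math import sqrt

# Number of males at each grade level, D through A+ (Table 1 note;
# the C-level count of 9,479 is an assumption where the print is unclear).
MALE_COUNTS = [609, 9479, 12711, 12102, 14468, 11988, 7377, 4830, 655]


def max_point_biserial(level_counts, n_achievers):
    """Largest point-biserial r attainable with both margins held fixed.

    Achievers are assigned to the highest grade levels first, so that no
    nonachiever has a higher grade than any achiever. Grade levels are
    coded 1..k from lowest to highest (an assumed coding).
    """
    codes = range(1, len(level_counts) + 1)
    n = sum(level_counts)
    mean = sum(c * k for c, k in zip(codes, level_counts)) / n
    variance = sum(c * c * k for c, k in zip(codes, level_counts)) / n - mean ** 2

    # Fill the achiever group from the top grade level downward.
    remaining = n_achievers
    achiever_sum = 0
    for code, count in reversed(list(zip(codes, level_counts))):
        take = min(count, remaining)
        achiever_sum += take * code
        remaining -= take
        if remaining == 0:
            break

    mean_achievers = achiever_sum / n_achievers
    p = n_achievers / n
    return (mean_achievers - mean) / sqrt(variance) * sqrt(p / (1 - p))


r_max_item1 = max_point_biserial(MALE_COUNTS, 7010)
ratio_item1 = 0.16 / round(r_max_item1, 2)  # observed r for Item 1 is .16
print(round(r_max_item1, 2), round(ratio_item1, 2))
```

Under these assumptions the sketch gives a maximum of about .57 for Item 1, in line with the first value listed above, and a ratio of observed to maximum correlation of about .28.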

A multiple correlation of .38 for 
males and .37 for females was ob- 
tained between the 18 achievement 
items and HSG. The multiple corre- 
lation model simulates the “real life” 
situation in which the skills intrinsic 
to high HSG can be utilized in liter- 
ary and/or scientific and/or leader- 
ship and/or other types of achieve- 
ments. However, the data did not fit 
the assumptions for multiple correla- 
tion in at least two ways: (a) Some 
of the regression lines were nonlinear, 
and (b) according to McNemar, the 
underlying distribution for a given 
type of achievement plotted against 
HSG probably is skewed. Perhaps 
more serious were the methodological 
problems that arose as a result of in- 
adequate coverage of types and de- 
grees of achievements. In the same 
sense that the range in verbal ability 
is not adequately measured by use of 
only very difficult test items, various 
degrees of talent must be included or 
the size of the multiple correlation 
necessarily will be limited. For 
instance, the coverage of science 
achievements in the present data ap- 
peared inadequate, since even the 
most frequently checked item, win- 
ning a high school science award, 
probably did not tap large numbers of 
students with considerable scientific 
ability. Furthermore, many other vari- 
eties of scientific achievement, such as 
publishing a scientific article, build- 
ing an intricate device, and inven- 
tions, were not covered. Considering 
that 18 items were probably an in- 
adequate sample of nonacademic 
achievements, it seems reasonable to 
suggest that with more adequate cov- 
erage it might have been possible to 
predict HSG almost as well from non- 
academic achievements as from Na- 
tional Merit Scholarship Test Scores 
—which correlated about .50 with 
HSG. The multiple correlation of 
HSG with nonacademic achievements 
may tell something about the overlap 
of skills between academic and non- 
academic achievements; however, it 
tells little about talent loss in gen- 
eral, because different types of talent 
vary widely with respect to the kinds 
of skills necessary for their successful 
demonstration. 

A low correlation between a par- 
ticular nonacademic accomplishment 
and academic achievement cannot be 
validly interpreted to mean that aca- 
demic ability is not necessary for 
that accomplishment. Therefore, one 
should not cite these low correlations 
as evidence that use of academic 
performance criteria—grades and 
achievement test scores—for college 
admission results in heavy loss of per- 
sons who are capable of “creative” 
performance in “real life” (nonaca- 
demic) situations. 

THE MANY FACES OF TALENT: 
A REPLY TO WERTS 

JOHN L. HOLLAND AND JAMES M. RICHARDS, JR. 
American College Testing Program 


Werts (1967) (a) argues that our correlational studies are misleading 
because of various statistical and sampling artifacts, and (b) implies 
that a heavy reliance on high school grades in selection would result in 
no great loss of students with other kinds of talent. When his data 
are retabled, they reveal that grades are an inefficient way to select 
for nonacademic talents and have little relationship with other talents. 
His explanations for the negligible correlations between academic 
and nonacademic accomplishment are contradicted by the published 
evidence. 


In a series of reports, we have 
found little or no relationship be- 
tween academic and nonacademic ac- 
complishment. In addition, we found 
that measures of academic potential 
are good predictors of academic suc- 
cess and measures of nonacademic 
potential are good predictors of non- 
academic accomplishment—but not 
vice versa (Holland & Nichols, 1964; 
Holland & Richards, 1965; Richards, 
Holland, & Lutz, 1966a, 1966b). From 
these results, and similar findings by 
others (Locke, 1963; Wallach & Ko- 
gan, 1965), we have concluded that if 
sponsors—college admissions offices, 
scholarship agencies, etc.—rely heav- 
ily on measures of academic potential, 
they will miss many people with non- 
academic talents. 

In the preceding report, Werts 
(1967) (a) argues that our correla- 
tional studies are misleading because 
of various statistical and sampling 
artifacts, and (b) implies that a 
heavy reliance on high school grades 
in selection would result in no great 
loss of students with other kinds of 
talent. We will try to show that 
Werts’s own data do not support his 
hypotheses, and that the published 
evidence contradicts his hypotheses. 

* The percentages reported in Table 1 
were computed by multiplying the per- 
centage of students at each grade level with 
a given kind of achievement by the number 
at that grade level and summing across 
grade levels to obtain the total number of 
students with that kind of achievement. 


WERTS’S DATA REANALYZED 


In his Tables 1 and 2, Werts tries 
to demonstrate that use of high school 
grades in selection will produce a 
group of students with many talents. 
For example, in Table 1 the selection 
of only A+ students would result in 
a group of men students of whom 
17.6% have won regional science 
awards, whereas only 3.5% of men 
students in general have won such an 
award. 

But the question of talent loss in- 
volves what you miss by various se- 
lection rules as well as what you 
get, so that Werts's table is only par- 
tially relevant to the question at 
issue. Werts’s results do indicate 
clearly that even low correlations 
yield impressive success ratios if the 
selection ratios are stringent, but this 
occurrence is neither new nor sur- 
prising. 

By reanalyzing Werts’s data, how- 
ever, it is possible to determine the tal- 
ent loss resulting from exclusive use 
of grades in selection.* From his data, 
we created a single table that shows 
what percentages of students with 

TABLE 1 
PERCENTAGES OF TALENTED STUDENTS ELIMINATED BY THE USE OF DIFFERENT 
GRADE LEVELS IN SELECTION 

                             |         Males           |        Females          |      All students 
Achievement area             |  B+  |  A-  |  A   |  A+  |  B+  |  A-  |  A   |  A+  |  B+  |  A-  |  A   |  A+ 
Science 
School science award         | 46.1 | 66.8 | 83.2 | 97.1 | 31.8 | 54.7 | 76.4 | 97.3 | 41.4 | 62.8 | 81.0 | 97.1 
Regional science award       | 42.7 | 61.8 | 79.9 | 95.6 | 28.0 | 50.8 | 74.6 | 96.8 | 37.7 | 58.1 | 78.1 | 96.0 
National science award       | 36.8 | 60.3 | 78.4 | 95.1 | 26.4 | 52.9 | 62.1 | 94.3 | 33.7 | 58.1 | 73.5 | 94.8 
Drama 
Lead in school play          | 57.8 | 76.6 | 89.5 | 98.7 | 40.4 | 65.1 | 84.1 | 98.5 | 50.0 | 71.4 | 87.1 | 98.6 
Speech and debate 
Regional speech award        | 45.7 | 65.4 | 83.6 | 97.9 | 33.1 | 57.7 | 79.7 | 97.5 | 39.9 | 61.8 | 81.8 | 97.7 
National speech award        | 48.3 | 75.0 | 87.2 | 97.8 | 42.6 | 63.9 | 79.6 | 95.4 | 46.2 | 70.8 | 84.4 | 96.9 
Leadership 
Elected to student office    | 55.5 | 74.7 | 88.6 | 98.6 | 38.3 | 63.0 | 83.1 | 98.1 | 47.5 | 69.2 | 86.0 | 98.4 
Elected class president      | 51.1 | 72.2 | 87.7 | 98.6 | 34.0 | 57.5 | 79.6 | 97.9 | 46.0 | 68.6 | 85.8 | 98.4 
Leadership award             | 50.4 | 71.2 | 86.5 | 98.2 | 31.9 | 57.3 | 80.2 | 97.8 | 41.7 | 64.7 | 83.6 | 98.0 
Music 
National music participant   | 65.4 | 81.3 | 91.6 | 98.4 | 47.9 | 69.9 | 87.0 | 98.6 | 55.6 | 74.9 | 89.0 | 98.5 
State music award            | 57.3 | 75.8 | 89.4 | 98.8 | 40.0 | 64.6 | 84.7 | 98.5 | 48.5 | 70.1 | 87.0 | 98.7 
National music award         | 56.9 | 76.1 | 87.9 | 97.7 | 40.4 | 64.2 | 82.6 | 98.1 | 47.2 | 69.1 | 84.8 | 97.9 
Art 
Art prize                    | 64.9 | 82.5 | 91.6 | 98.3 | 46.9 | 69.9 | 86.5 | 98.2 | 54.4 | 75.1 | 88.6 | 98.3 
School art exhibit           | 64.3 | 80.1 | 90.8 | 98.5 | 47.1 | 70.3 | 87.0 | 98.6 | 54.0 | 74.2 | 88.5 | 98.5 
Other art exhibit            | 60.4 | 77.9 | 90.2 | 98.5 | 44.2 | 67.6 | 85.8 | 98.4 | 50.6 | 71.6 | 87.5 | 98.5 
Literary 
Editor of school publication | 45.8 | 66.5 | 83.1 | 97.7 | 30.2 | 55.2 | 78.3 | 97.5 | 36.9 | 60.1 | 80.4 | 97.6 
Published other than school paper | 47.0 | 67.8 | 84.2 | 97.4 | 32.4 | 56.6 | 78.8 | 97.0 | 39.1 | 61.7 | 81.3 | 97.2 
Creative writing award       | 39.2 | 59.5 | 79.1 | 96.5 | 25.5 | 48.8 | 74.2 | 96.5 | 31.2 | 53.2 | 76.3 | 96.5 
Number of areas checked 
Greater than one             | 51.9 | 72.0 | 86.7 | 98.2 | 36.6 | 61.5 | 82.3 | 98.1 | 44.5 | 66.9 | 84.0 | 98.1 
Academic achievement 
Top decile                   |  0.0 |  0.0 | 26.2 | 91.2 |  0.0 |  0.0 |  0.0 | 87.0 |  0.0 |  0.0 |  4.9 | 89.5 
All students                 | 66.6 | 82.7 | 92.6 | 99.1 | 47.8 | 70.8 | 87.3 | 98.7 | 59.0 | 77.9 | 90.5 | 99.0 

Note.—This table is based on a reanalysis of data presented by Werts (1967). Entries are the percentage of students 
with each kind of achievement who have grades lower than the column head. 


various kinds of achievement are elim- 
inated by the use of various grade 
levels as selection scores. We have 
not considered grade levels below B+ 
because below that level a college 
usually has a nonselective admissions 
policy. When you look at Table 1, it 
is clear that the selection of only A+ 
or A students (a selection rule that 
will admit nearly all students in the 
top decile on grades) will result in 
the elimination of 74-93% of all stu- 
dents with various kinds of nonaca- 
demic accomplishment. To take an- 
other more concrete example, if you 
only select the A or A+ students 
(about the top decile of academic 
talent), you would get 1,843 class 
presidents, but you would miss 11,096 
class presidents. If you select only 
students with B+ grades or higher 
you reject 51.9% of the men and 
36.6% of the women who have checked 


more than one kind of achievement. 
In short, the use of grades as an ef- 
ficient sign for the selection of multi- 
talented persons is not warranted by 
the Werts data. 
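
The retabling procedure described in the footnote (multiplying each grade level's percentage of achievers by the number of students at that level, then summing) can be sketched as below. The sketch uses the male school science award row and the male grade-level counts from Werts's Table 1; the C-level count of 9,479 is an assumption where the printed note is hard to read, and the function name `percent_eliminated` is hypothetical.

```python
LEVELS = ["D", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]

# Males at each grade level (Werts, Table 1 note; C count assumed to be 9,479).
MALE_COUNTS = [609, 9479, 12711, 12102, 14468, 11988, 7377, 4830, 655]

# Percentage of males at each level who checked "school science award".
SCHOOL_SCIENCE_PCT = [3.1, 4.2, 5.7, 7.1, 8.5, 12.1, 15.6, 20.1, 31.6]


def percent_eliminated(counts, pct_achievers, cutoff):
    """Percentage of all achievers whose grade falls below the cutoff level."""
    achievers = [n * p / 100.0 for n, p in zip(counts, pct_achievers)]
    below = LEVELS.index(cutoff)
    return 100.0 * sum(achievers[:below]) / sum(achievers)


for cutoff in ("B+", "A-", "A"):
    lost = percent_eliminated(MALE_COUNTS, SCHOOL_SCIENCE_PCT, cutoff)
    print(cutoff, round(lost, 1))
```

Under these assumptions the sketch gives 46.1, 66.8, and 83.2 for cutoffs of B+, A-, and A, which agrees with the first row of the male columns in the reanalysis table.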


MORE EVIDENCE 


In this section, we will show that 
the published evidence strengthens the 
hypothesis that academic and non- 
academic accomplishment are largely 
independent dimensions, and that the 
numerous replications that support 
this hypothesis cannot be invalidated 
by the peculiarities of the data or the 
statistical problems that Werts (1967) 
and McNemar (1964) have suggested. 

1. Do the small percentages of stu- 
dents with nonacademic accomplish- 
ments present a misleading picture of 
the actual relationships between aca- 
demic and nonacademic accomplish- 
ment? Our Table 1 indicates that this 
argument has a little validity, since 
in only a few cases is the percentage 
of nonacademic achievers eliminated 
quite as large as the percentage of all 
students eliminated. Nevertheless, al- 
most as many nonacademic achievers 
as compared to all students are eli- 
minated, and many more nonaca- 
demic than academic achievers are 
eliminated. 

The most persuasive refutation of 
the percentage argument is provided 
in other research (Richards, Holland, 
& Lutz, 1966a, 1966b) by a special 
criterion scale developed to assess 
Recognition for Academic Accom- 
plishment (RAA) at the college level. 
This scale of only five items, such as 
being on the Dean’s honor list, se- 
lected for an honors program, elected 
to Phi Beta Kappa etc., has more sta- 
tistical defects than the typical non- 
academic achievement scales that 
Werts criticizes. Its estimated relia- 
bilities range only from .31 to .50; its 
means and SDs for various samples 
are less than 1 so that it has a vio- 
lently skewed distribution of student 
scores. Therefore, if the negligible 
correlations that are usually found 
between academic and nonacademic 
accomplishment are simply due to 
statistical defects, the RAA scale 
should also be uncorrelated with 
grades. Yet the RAA scale is mod- 
erately correlated with grades and 
academic test scores, for both con- 
current and predictive relationships, 
while the nonacademic achievement 
scales are not correlated with grades, 
so that the evidence provides both 
convergent and discriminant validity. 
In other words, the correlations do 
present an accurate picture of the actual 
situation, and the content of 
these scales is a more important de- 
terminant of correlation than are 
some of their statistical properties. 
Finally, in the Holland and Richards 
(1965) report there are several cri- 
terion scales that have normal dis- 
tributions (Total Competencies and 
Originality), but they too have neg- 
ligible relationships with grades. 

2. What happens when you use 
academic and nonacademic predic- 
tors? If Werts’s hypothesis has valid- 
ity, then the selection of students by 
reliance on high school grades would 
lead to the selection of both academic 
and nonacademic achievers. Several 
studies at the National Merit Schol- 
arship Corporation, however, contra- 
dict this hypothesis. In a simulated 
selection study, Nichols & Holland 
(1964) found that selection by high 
grades produced a student sample 
that obtained high grades in college 
but a poor record of nonacademic ac- 
complishment. In contrast, selection 
by nonacademic achievements pro- 
duced a student sample that obtained 
an outstanding record of nonacademic 
accomplishment in college without 
lowering the level of academic per- 
formance. Other studies at National 
Merit by Holland (1961, 1964), Hol- 
land and Nichols (1964), Astin (1962), 
Nichols (1966), and Roberts (1965), 
without exception, suggest that you 
will get different groups of students 
with academic and nonacademic criteria, 
so different that they are not 
useful substitutes for one another. 

More recently, using large diverse 
student samples, Richards, Holland 
and Lutz (1966b) at ACT found that 
high school grades would not predict 
nonacademic accomplishment in col- 
lege, although nonacademic predic- 
tors, once again, had useful predic- 
tive validities across 22 colleges. 

3. Is the lack of relationship be- 
tween academic and nonacademic ac- 
complishment due to a narrow range 
of talent? In a representative sample 
of 18,378 students taken from a population 
of 612,000 college applicants, 
Holland and Richards (1966) again 
found that the relationships between 
ACT test scores and high school 
grades are generally negligible. This 
refutes the argument that our results 
are due largely to the restriction in 
range resulting from the college se- 
lection process. In an older article, 
using a very narrow range of talent 
(National Merit Finalists), Holland 
(1961) found that the NMSQT and 
the SAT correlate as much as .64 and 
.57 over a period of 1 year, but neither 
test had any appreciable concurrent 
relationship with scientific or artistic 
accomplishment outside the classroom 
(r's ranged only from -.19 to +.13). 
In short, there is no longer any evi- 
dence that range of talent is a plau- 
sible explanation of all of the repeated 
occurrence of negligible relationships. 

4. What about...? We have done 
most of the things Werts cites as de- 
sirable in studies of academic and 
nonacademic achievement. In the 
specific study Werts criticizes, we 
used both test scores and grades as 
measures of academic accomplish- 
ment, and found that grades produce 
relationships that are equivalent to 
those obtained by the use of a more 
reliable test (see Table 4, Holland 
& Richards, 1965). We have studied 
academic and nonacademic accomplishment 
in a large sample of college 
applicants (Holland & Richards, 
1966) and found again that academic 
and nonacademic accomplishment 
have only negligible relationships. 
Last, we studied each nonacademic 
accomplishment for a subgroup of 
students presumably interested in that 
accomplishment (Holland & Richards, 
1966); for example, we studied the 
relationship between scientific accomplishment 
and academic accomplishment 
using only students intending to 
major in scientific fields. In a few 
cases, interest does act as a moderator 
of the relationship between academic 
and nonacademic accomplishment, 
so that the relationship is 
noticeably greater within subgroups 
than for the total group. Even in these 
cases, however, academic accomplishment 
is still an inefficient predictor of 
nonacademic accomplishment. 

Student expectancies concerning persistence in college and sources 
of conflict leading to withdrawal from college were related to se- 
lected precollege performance, scholastic ability, and personality 
variables. Perceived reasons for college withdrawal generated a 3- 
dimensional space, the defining vectors being academic and work 
skills and their utilization, motivation, and adjustment. Correla- 
tions computed separately for the groups reporting high and low 
probabilities of college dropout suggested that the former group is 
more concerned with satisfying the expectancies of their parents 
and that failure to do so is anxiety and guilt producing. Lack of 
commitment to educational pursuits coupled with this need to sat- 
isfy parental expectancies apparently leads to initially adequate 
performance but subsequent underachievement. 3 times as many 
of this group withdraw as do the low probables. Similar differences 
were observed between remaining and dropout students after 3 terms. 


Periodically, the problem of the 
college dropout emerges in the re- 
search and educational literature as 
one of intense concern, particularly 
to those who view withdrawal as a 
measure of institutional efficiency. De- 
spite the effort, both conceptual and 
research, and the unanimity con- 
cerning the wastefulness of attrition 
(Dressel, 1943; Macintosh, 1948), the 
college attrition rate remains discon- 
certingly stable. Summerskill (1962), 
after reviewing some 35 studies span- 
ning a period of 49 years, concluded 
that, on the average, 50% of matric- 
ulating students withdraw during the 
normal 4-year period. Data reported 
by the Dean of Admissions and Reg- 
istrar at the Pennsylvania State Uni- 
versity indicate that 50% of the stu- 
dents enrolling in the fall semester of 
1955 withdrew through 5 years. In a 
more recent study at Penn State, 
Lindsay, Marks, and Hamel (1966) 
reported an all-university, that is 
both main and Commonwealth campus, 
attrition of 55% within 4 years. 

*The author gratefully acknowledges the 
assistance of Craig Messersmith in the construction 
of the inventory scales and early 
development of the study. The author is 
now at the Georgia Institute of Technology. 

Although estimates of the magni- 
tude of college dropout seem unam- 
biguous, the reasons for and nature of 
this phenomenon are certainly much 
less clear. With few exceptions the 
research on college dropouts is char- 
acterized by a lack of an adequate 
conceptual base and a reliance upon 
ex post facto methodology. The typical 
research design employed in this 
area consists of identifying dropouts 
and then examining either a broad 
range of precollege characteristics, 
for example, age (Gable, 1957), sex 
(Iffert, 1957), parental occupation 
(Suddarth, 1957), community size 
and location (Strang, 1937), scholastic 
ability (Summerskill, 1962), and 
high school preparation (Iffert, 1957), 
or postwithdrawal factors, such as 
reasons for withdrawal (Iffert, 1957), 
feelings relating to the educational 
experience and withdrawal (Iffert, 
1957), motivation (Mercer, 1941; 
Woods & Chase, 1937), and adjust- 
ment (Farnsworth, Funkenstein, & 
Wedge, 1955). The use of adequate 
controls necessitated by this type of 
research has been singularly lacking, 
which renders the frequently contradictory 
results only more uninterpretable. 
The only reliable conclusion 
emerging from this mass of research 
activity is that students with poor 
high school preparation or lower scho- 
lastic aptitude (and in many cases, 
both) have a higher incidence of col- 
lege withdrawal. Summerskill (1962) 
provides an excellent review and criticism 
of the college dropout literature. 

An immediate and appealing con- 
jecture is that the majority of with- 
drawals from college are directly 
traceable to academic difficulty. What 
evidence there is, and admittedly this 
is a difficult thing to measure, sug- 
gests that this is not the case. Sum- 
merskill (1962) concludes that only 
about one-third of college dropouts 
are due to academic difficulty. Of the 
total number of students matricu- 
lating at Penn State in the fall se- 
mester of 1955, 15% (or roughly 30% 
of the dropouts) were dismissed by 
the University for academic reasons. 
Data on file with Student Affairs 
Research at Penn State are also rele- 
vant. These data indicate that the 
correlation between academic achievement, 
as measured by grade-point average 
(GPA), and persistence, as 
measured by the number of terms a 
student remains in school, is at best 
moderate, with many withdrawing 
students (particularly female) having 
cumulative averages well above the 
minimum required. 

In some cases the reasons for col- 
lege withdrawal are clearcut, perhaps 
academic failure or simply running 
out of funds. If our estimates are anywhere 
near correct, there still remains 
a sizeable proportion of college dropouts, 
around 60%, for whom the dropout 
process is considerably more complex 
and probably multiply caused. It 
is this group to which the present 
studies are addressed. 

In the present study college with- 
drawal is viewed as the behavioral 
outcome of a decisional process the 
basic data of which are those percep- 
tions, cognitions, and stimuli which the 
student judges to be relevant to his 
college behavior. The typical student 
entering a college setting has devel- 
oped through experience with a similar 
environment, for example, high school, 
and relationships with significant per- 
sonages and institutions, such as par- 
ents and church, an extensive and in- 
tricate set of expectancies relating to 
his own behavior and the behavior 
of others, and the consequences of his 
behavior. These expectancies and 
their environmental confirmation or 
disconfirmation serve as antecedents 
to observed behavior. The student in 
the two-choice withdrawal situation 
can be treated as a rational organism 
who evaluates and relates certain 
cognitions concerning himself, for ex- 
ample, ability, performance, persist- 
ence, educational and career goals, 
and perceptions and cognitions con- 
cerning environmental inputs, for ex- 
ample, parental and peer pressures, 
educational “payoff,” experience of 
success or failure, and then "decides" 
on one course or the other. Superim- 
posed on this matrix of perceptions 
and cognitions and affecting both 
their formal properties and possibly 
their relationship to the behavioral 
outcome, that is, the decision and 
act of withdrawal, is the class of judg- 
mental skills and preferred modes 
of dealing with informational input 
which are broadly termed personality 
variables. 

Within this framework, research 
efforts would tend to focus on certain 
perceptual and cognitive structures 
such as expectancies or anticipations 
regarding educational behaviors and 
outcomes, the identification and as- 
sessment of a class of admissible end 
states (goals), the congruence of ex- 
pectancies and perceived behavioral 
outcomes, factors relating to the 
development and expression of expect- 
ancies, and finally, changes in cogni- 
tive structures and behavior with con- 
tinued intercourse with the college 
environment. The influence of theo- 
rists such as Bruner (1957), Kelly 
(1955), Lewin (1936), and Tolman 
(1951) upon the present development 
is apparent. 

The data reported here are those 
from the first in a series of studies 
examining student expectancies con- 
cerning withdrawal from college, their 
precollege ability, personality, and 
learning correlates, and changes in 
these expectancies attributable to se- 
lected situational inputs, for example, 
conditions of success or failure, col- 
lege environments, and so forth. It is 
intended that these studies extend 
over the 4-year college experience and 
include postcollege behaviors, for ex- 
ample, occupational stability and job 
satisfaction, as well. Examination of 
expectancies as expressed immedia- 
tely prior to college, including their 
precollege ability and personality cor- 
relates, and their change or resistance 
to change as a function of exposure 
to college may provide a fruitful approach 
to the study of college dropout. 


Method 


Construction of the Attrition Expectancy 
Questionnaire 


To determine those reasons and areas of 
conflict which students judge as relevant to 
college dropout, a sample of 1000 freshmen 
entering Penn State in the summer term 
of 1965 were administered a two-item, free 
response questionnaire. For both items the 
subjects (Ss) were told that 50% of the 
students who matriculate at Penn State 
fail to graduate. Item 1 simply asked S to 
list all “your reasons," if he were among the 
50% who withdrew. Item 2 asked S to 
think of a person whom he knew well who 
would probably be in the dropout group 
and list his reasons for withdrawing. The 
items were counterbalanced in their presen- 
tation. 

A content analysis of Ss' responses yielded 
some 33 relatively frequent and distinct 
perceived reasons for college withdrawal, 
many of which a sophisticated investigator 
would consider rather naive, for example, 
“can’t cope with pressures,” or “dislike the 
college.” These were precisely the “naive” 
verbalizations in which the authors were 
interested, however. 

These 33 statements of reasons for with- 
drawing from college, together with 10 
statements relating to actual withdrawal or 
transfer from college formed the Attrition 
Expectancy Questionnaire (AEQ). Each 
item was accompanied by a five-point sub- 
jective probability scale ranging from highly 
probable to highly improbable. The ques- 
tionnaire was introduced by telling Ss that 
roughly one-half of Penn State students 
fail to complete their program of study. 
The S was to respond to each item in terms 
of how probable that statement was with 
respect to his college behavior. 

Two sample items are: 

1. How likely is it that you might interrupt 
your college program? 

    1           2           3          4           5 
  highly                                         highly 
improbable  improbable   neutral    probable    probable 

2. If I withdraw it will be due to my 
poor high school preparation. 

    1           2           3          4           5 
  highly                                         highly 
improbable  improbable   neutral    probable    probable 


Ability, Personality, and Learning 
Correlates 


Having focused attention upon student 
expectancies, a legitimate question to ask 
is how these expectancies are shaped. Most 
investigators of college dropouts agree that 
one important factor involves the broad 
area of family dynamics, particularly parental 
attitudes and behaviors (Brown, 
1963). Some data are available which sug- 
gest that certain parental attitudes and be- 
haviors as perceived by the student are re- 
lated to the student's college performance 
(Malloy, 1954; Sexton, 1965; Trent, Athey, 
& Craise, 1965). For the present study, two 
scales reflecting two important aspects of 
the parent-student relationship were con- 
structed: The first, parental attitude, as- 
sessed the degree of parental acceptance or 
rejection the student experienced, while the 
second, parental press, evaluated the per- 
ceived influence the parents exerted upon 
the student's behavior. 

The effects of family dynamics as well 
as other socializing forces, for example, cul- 
tural values, peer behaviors, etc., can be 
extended to more specific cognitive struc- 
tures and motivational states. Three such 
areas felt to be important were fear of fail- 
ure, level of aspiration, and educational 
values. Again, there is some, though slight, 
evidence that these variables are related to 
academic performance (Hancock & Teevan, 
1964; Moulton, 1965; Munger, 1956). Three 
true-false scales were constructed, one for each 
of these variables. Together, the five scales 
were intended to cover a significant portion 
of the range of parental and social influ- 
ences upon the students’ educational ex- 
pectancies. 

Two other classes of variables important 
to the formation of these expectancies were 
also included in the analysis. Intellectual 
functioning would seem to be relevant in 
two ways. First, the brighter student simply 
has a better chance of meeting the academic 
requirements of his college program. Second, 
and more subtle, the brighter student would 
presumably have the skills needed for a 
more effective and elegant management of 
his interpersonal and other environmental 
relationships. As such, one might expect a 
positive correlation between a measure of 
scholastic ability and expectancies of col- 
lege achievement and success. 

The set of environmental outcomes, par- 
ticularly those relating to the student's 
academic performance, is the second set of 
interest. During high school, the student 
has experienced certain academic and career- 
relevant feedback which falls broadly under 
the rubric of success-failure. This environ- 
mental confirmation or  disconfirmation 
would serve either—in the former case— 
to strengthen expectancies, or to induce 
cognitive realignment or change in the 
latter case (Bruner, 1957). A measure of 
high school academic achievement would 
provide some indication of this feedback. 


Procedure 


The 43-item AEQ and the five scales 
reflecting level of aspiration, fear of failure, 
educational values, and parental attitudes 
and press were administered to a random 
sample of Penn State freshmen immediately 
prior to their registration for the first term 
of work. In addition to these measures, the 
SAT verbal, quantitative, and total scores, 
high school average, and a measure of high 
school quality—the percentage of students 
in the high school graduating class going to 
college—were obtained for each student. 
This latter variable was included to evalu- 
ate and control for possible differences in 
quality among the high schools from which 
the Ss came (Marks & Murray, 1965). 

The first term and third term cumula- 
tive GPAs as well as the dichotomized 
score—either 0 or 1—indicating withdrawal 
or nonwithdrawal from the university after 
1 year of study were included. 

Several related analyses were undertaken 
on these data. First, the 43 AEQ items were 
intercorrelated and the pattern of correla- 
tions examined for a smaller number of 
explanatory dimensions of college dropout. 
It was originally intended to factor this 
correlation matrix, but upon examination 
the joint probability density function of 
these variables was so obviously nonnormal 
that the results of a factor analysis would 
be highly questionable. As a result, “fac- 
toring” was done by inspection. 

Second, and again using the entire sam- 
ple, the correlations between the 43 AEQ 
items and the other 13 variables as well as 
the intercorrelations among these latter 
variables were computed and examined. 

To determine differences in the frames 
of reference between those Ss reporting a 
high probability of withdrawing from col- 
lege and those reporting a low probability, 
means, standard deviations, and covariances 
of the 43 AEQ items and 13 variables were 
computed separately for the two groups. 
Special interest was directed toward possi- 
ble differences between corresponding cor- 
relation values. 

Finally, the sample was again broken 
down on the criterion—withdrew from or 
remained in college—and the means, stand- 
ard deviations, and covariances of all vari- 
ables computed for the two groups. 

The reason for examining covariance 
matrices separately for both the attrition 
and nonattrition groups and the probable 
and nonprobable attrition groups was to uncover 
any possible nonadditive effects related 
to the two variables, expected dropout 
and actual dropout. 


Subjects 


The Ss were 300 students selected ran- 
domly from the entire freshman class ma- 
triculating at Penn State in the fall term of 
1965. There were 229 males and 71 females. 
All were tested prior to registration for their 
first term, so that none had experience with 
the college academic environment and little, 
if any, contact with its social environment. 


RESULTS 

An incidental but interesting find- 
ing concerned the attribution of drop- 
out as elicited by the two initial free 
response items. When referring to 
themselves Ss spoke mostly of exter- 
nal or personally acceptable causes, 
e.g., poor preparation, lack of funds, 
present school not first choice. Rea- 
sons given for the dropout of another 
person were imputed almost entirely 
to personal weaknesses, e.g., imma- 
turity, maladjustment, lack of self- 
discipline. Apparently, certain de- 
fensive strategies are operative in the 
responsivity to such items, a finding 
which has implications for the post- 
withdrawal type of attrition ques- 
tionnaire. 

The results of the crude clustering 
of the correlations among the 43 AEQ 
items—crude in the sense that a pre- 
cise analytic procedure was lacking— 
suggested three major dimensions ac- 
counting for the correlations. For this 
clustering only correlations greater 
than |.30| were used. One dimension 
of perceived college withdrawal in- 
volved academic skills and utiliza- 
tion of such skills. Items "loading" 
high on this dimension were lack of 
ability, poor study habits, inability to 
concentrate, poor high school prep- 
aration, and laziness. A second rather 
clear-cut dimension related to moti- 
vation. Describing this dimension 
were such items as want to do other 
things, prefer to “bum around,” no 
reason for attending college, lack of 
occupational or career goals, and 
again laziness. The third and most 
extensive dimension involved what 
was simply called adjustment. Items 
defining this dimension were inability 
to handle independence (the main 
campus of Penn State is attended al- 
most entirely by resident students), 
immaturity, homesickness, inability 
to cope with academic and social pres- 
sures, can’t get along with other stu- 
dents, academic frustration, disillu- 
sionment, inability to conform, and 
wanting to go to a smaller school. 

These dimensions did not appear to 
be completely distinct, there being 
some slight overlap of items. In gen- 
eral, however, the three areas seemed 
clearly defined. Two minor causes of 
dropout which also emerged from this 
analysis were social or nonacademic 
overinvolvement, and leaving for a 
job or marriage. 
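
The "factoring by inspection" described above can be approximated mechanically. The sketch below is only one possible reading of the procedure: it links any two items whose correlation exceeds |.30| and reports the resulting groups. The function name, the item labels chosen, and the correlation values are hypothetical and are not the study's data.

```python
from itertools import combinations

def cluster_by_threshold(names, corr, threshold=0.30):
    """Group items linked, directly or through other items, by |r| > threshold."""
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(names)), 2):
        if abs(corr[i][j]) > threshold:
            parent[find(names[i])] = find(names[j])

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    # Keep only groups with at least two items.
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)

# Hypothetical correlations among five AEQ-style items.
names = ["lack of ability", "poor study habits", "homesickness",
         "immaturity", "lack of funds"]
corr = [
    [1.00, 0.45, 0.05, 0.10, 0.02],
    [0.45, 1.00, 0.08, 0.12, 0.01],
    [0.05, 0.08, 1.00, 0.38, 0.04],
    [0.10, 0.12, 0.38, 1.00, 0.03],
    [0.02, 0.01, 0.04, 0.03, 1.00],
]
clusters = cluster_by_threshold(names, corr)
print(clusters)
```

This linking rule places each item in exactly one group, so it cannot reproduce an item such as laziness that appeared on two dimensions; inspection by eye allows that kind of overlap.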

The correlations between the 43 
AEQ items and the 13 ability, per- 
sonality, and performance variables 
were generally rather low (although 
many reached the r value of .14 
needed for significance at the .01 
level), except for the level of as- 
piration, parental attitude, and fear 
of failure scales. 

Students who reported a high level 
of aspiration felt it was highly un- 
likely that they would encounter difficulty 
or leave school because of lack 
of occupational or career goals, poor 
motivation, lack of self-discipline, in- 
ability to concentrate, poor study 
habits, lack of ability, laziness, or a 
general unreadiness for college. In ad- 
dition, such students reported that it 
was highly unlikely that they would 
interrupt their college studies or 
transfer to another school. 

On the other hand, students who re- 
ported a high fear of failure felt that 
if they did drop out of college it would 
likely be due to an inability to con- 
centrate, inadequate high school prep- 
aration, inability to cope with aca- 
demic pressures or teaching, academic 
frustration, and lack of occupational 
and career goals. 

Students who reported their parents 
as accepting and supporting felt it 
was unlikely that they would inter- 
rupt their college career because of 
inadequate ability, poor study habits, 
lack of self-discipline, insufficient mo- 
tivation or career goals, or laziness. 

Two other correlations are also 
worth noting. Neither the brighter 
students (SAT) nor the higher-achiev- 
ing students (GPA) felt that inade- 
quate high school preparation was a 
probable source of college withdrawal. 

For all comparisons a correlation 
coefficient of |.20| or higher was re- 
quired. Correlations of the AEQ items 
with withdrawal-nonwithdrawal are 
reported separately in the section re- 
lating to actual dropout. 


Differences between Probable and 
Nonprobable Dropouts 


Of the 300 students examined, 35 
reported that it was highly probable 
or probable that they would discon- 
tinue their studies at Penn State. 
Ninety-seven students reported that 
it was highly improbable that they 
would discontinue. 
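
The grouping rule implied here can be written out explicitly. The sketch assumes the five-point coding of the sample items (1 = highly improbable through 5 = highly probable); the function name and the responses shown are hypothetical.

```python
from collections import Counter

def dropout_group(response):
    """Classify a 1-5 response to the 'discontinue studies' item."""
    if response >= 4:        # probable or highly probable
        return "probable"
    if response == 1:        # highly improbable
        return "nonprobable"
    return "unclassified"    # improbable or neutral: in neither group

# Hypothetical responses, not the study's data.
responses = [1, 1, 2, 3, 5, 4, 1, 2, 2, 3]
counts = Counter(dropout_group(r) for r in responses)
print(counts)

# In the study itself, 300 - 35 - 97 students fall in neither group.
unclassified_in_study = 300 - 35 - 97
print(unclassified_in_study)
```

Under this rule the two comparison groups are the extremes of the distribution, and the remaining 168 students enter neither group.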

As might be expected, there was a 
significant difference between the 
mean vectors of the remaining 42 
AEQ items for these two groups, the 
T² value being significant at the .05 
level. For every item, the probable 
group reported a higher probability 
of withdrawing for the stated reason. 
Those items most clearly differen- 
tiating the two groups covered only 
two of the three dimensions previ- 
ously mentioned, including specifi- 
cally poor motivation, inability to 
work to potential, laziness, frustration, 
lack of occupational or career 
goals, wanting to “bum around,” ex- 
ternal attractors such as employ- 
ment, marriage, or the Army, etc., 
immaturity, not knowing “why I’m 
here,” and a general inability to ad- 
just to or cope with academic pres- 
sures. The area where differences were 
much smaller, if not negligible, was 
that of academic and work skills and 
their utilization. 
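
The comparison of mean vectors reported above is presumably Hotelling's two-sample T² test. A minimal sketch of that computation in plain Python follows, assuming equal covariance matrices in the two groups. The function names and the small two-variable data set are hypothetical and are not the study's data.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter_matrix(rows, mean):
    """Sum of squares and cross-products of deviations from the mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def hotelling_t2(group1, group2):
    """Return T-squared and the equivalent F statistic (df = p, n1 + n2 - p - 1)."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    p = len(m1)
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(p)]
              for i in range(p)]
    diff = [a - b for a, b in zip(m1, m2)]
    weights = solve(pooled, diff)          # pooled^-1 * diff
    t2 = (n1 * n2) / (n1 + n2) * sum(d * w for d, w in zip(diff, weights))
    f_stat = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
    return t2, f_stat

# Hypothetical scores on two items for two small groups.
group1 = [[2.0, 3.0], [3.0, 5.0], [4.0, 4.0], [5.0, 6.0]]
group2 = [[1.0, 1.0], [2.0, 3.0], [3.0, 2.0], [2.0, 2.0]]
t2, f_stat = hotelling_t2(group1, group2)
print(t2, f_stat)
```

With a single variable the statistic reduces to the square of the ordinary two-sample t, which gives a convenient check on the arithmetic.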

The means, standard deviations, 
and intercorrelations among the 11 
(third term cumulative GPA was 
deleted because of numerous zero val- 
ues) ability, personality, and per- 
formance variables are reported sep- 
arately for the two groups in Table 
1. 

Only three of the mean differences 
were significant at the .05 level. Stu- 
dents assigning a low probability to 
their possible withdrawal from college 
obtained higher level of aspiration 
and educational values mean scores 
than those students assigning a high 
probability to this behavior. Most in- 
teresting, the nonprobable dropout 
group also had a significantly higher 
mean withdrawal score. This value re- 
flects the number of dropouts—the 
higher the value the fewer dropouts. 
Transforming these means to per- 
centages indicated that 30% of those 
students reporting probable dropout 
withdrew within three terms whereas 
only 10% of the nonprobable group 
withdrew. 
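
Because the withdrawal score is coded 0 or 1, with higher values meaning fewer dropouts, a group mean converts directly to a dropout percentage. The sketch below assumes 0 = withdrew and 1 = remained, which is our reading of the coding; the function name and the scores are hypothetical.

```python
def percent_withdrew(scores):
    """Convert 0/1 scores (0 = withdrew, 1 = remained) to a dropout percentage."""
    mean_score = sum(scores) / len(scores)
    return 100 * (1 - mean_score)

# Hypothetical group of 20 students, 6 of whom withdrew.
scores = [0] * 6 + [1] * 14
print(f"{percent_withdrew(scores):.0f}% withdrew")
```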

Examination of the correlations 
indicated that the brighter students, 
as usual, performed better in college, 
performed better in high school, and 
attended the better high schools. In 
addition, they tended to report low 
fear of failure. The higher-achieving 
students, aside from being brighter 
and having better high school records, 
also reported their parents as being 
more accepting and supporting than 
did the lower-achieving students. 
High school average and parental 
attitudes were also positively correlated. 
Students who attended the 
better high schools tended to report 
higher educational values and lower 
fear of failure, while parental attitude, 
already shown to be related to 
high school and college achievement, 
was negatively correlated with perceived 
parental press. 

TABLE 1 
Means, Standard Deviations, and Intercorrelations among the Eleven 
Ability, Personality, and Performance Variables for the Probable and 
Nonprobable Dropout Groups 
(Variables legible in the source: SAT V, SAT Q, SAT T, HSA, % College, 
1st Term GPA, Fear of failure, Educational values, Level of 
aspiration, Parental attitudes.) 

Although a thorough examination 
of the nonadditive effects of probable 
or nonprobable withdrawal would require 
testing the equality of the complete 
variance-covariance matrices 
of the two groups (Anderson, 1958), 
only a few of the pairs of correlations 
were of interest, in particular those 
involving GPA and the personality 
variables. Tests of significance were 
made on selected pairs of correlation 
coefficients by use of the test criterion 
$W = \sum_i (n_i - 3)(Z_i - \bar{Z})^2$ (Graybill, 
1961). Those correlations whose 
differences were significant at the .05 
level are underlined in Table 1. 
The negative correlation between 
fear of failure and GPA, and the 
positive correlation between educa- 
tional values and level of aspiration 
for the probable dropout group are to 
be compared with the essentially zero 
correlations between these variables 
for the nonprobable group. Most 
striking is the positive correlation be- 
tween fear of failure and parental 
attitudes for the probable group, 
while this correlation is negative for 
the nonprobable dropouts. 
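
The test criterion W used for these comparisons can be sketched as follows. We assume, as is standard for this kind of test, that each Z is the Fisher z transform of a correlation, that the mean Z is weighted by n - 3, and that W is referred to a chi-square distribution with one degree of freedom when two groups are compared. The function names and the two correlations are hypothetical; only the group sizes (35 and 97) come from the text.

```python
import math

def fisher_z(r):
    """Fisher's z transform of a correlation coefficient."""
    return math.atanh(r)

def w_statistic(rs, ns):
    """W = sum over groups of (n_i - 3) * (Z_i - Z_bar)^2."""
    zs = [fisher_z(r) for r in rs]
    weights = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return sum(w * (z - z_bar) ** 2 for w, z in zip(weights, zs))

def p_value_two_groups(w):
    """Upper-tail chi-square probability with 1 df (two groups only)."""
    return math.erfc(math.sqrt(w / 2))

# Hypothetical correlations for the probable and nonprobable groups.
rs = [0.30, -0.25]
ns = [35, 97]
w = w_statistic(rs, ns)
p = p_value_two_groups(w)
print(w, p)
```

For two groups, W equals the squared difference between the two Z values divided by 1/(n1 - 3) + 1/(n2 - 3), so the small probable group (N = 35) limits the power of every such comparison.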


Note (Table 1). The values for the probable dropout group are given 
in parentheses. For the probable group, N = 35; for the nonprobable dropout group, N = 97. 
Decimal points for intercorrelations have been omitted. 


Differences between Dropouts and 
Nondropouts 


Of the 300 students studied, 39, or 
13%, withdrew during the first three 
terms. The remaining 261 students 
completed at least three terms of 
study. 

The most striking differences be- 
tween the two groups on the 43 AEQ 
items involved the perceived chances 
of discontinuing studies and, second, 
transferring to another school. In 
both cases, students who withdrew as- 
signed a higher probability to those 
two outcomes than did the students 
who were still enrolled. Again, the 
mean vectors were significantly dif- 
ferent. Those items most clearly dif- 
ferentiating the two groups were poor 
high school preparation, lack of occu- 
pational or career goals, and six items 
relating to adjustment. As with the 
probable-nonprobable dropout dimen- 
sion, withdrawal due to academic abil- 
ity and skills and their utilization was 
not considered relevant. It should be 
recalled, however, that the probable 
and actual dropout groups were not 
independent, roughly 30% of the 
former actually withdrawing from college. 

The means, standard deviations, 
and intercorrelations among the 12 
ability, personality, and performance 
variables are reported separately for 
the two attrition groups in Table 2. 

There were several significant 
mean differences. Students who re- 
mained in college had higher SAT 
scores, higher high school averages, 
higher first term GPA’s, and also 
had higher level of aspiration and 
parental attitude mean scores. On 
the other hand, dropouts reported 
higher feelings of fear of failure. 

Nonadditive effects of college drop- 
out were again tested by examining 
selected pairs of correlation values. 
Significant differences are underlined 
in Table 2. Students withdrawing from 
college and reporting a high fear of 
failure reported higher educational 
values and tended to come from the 
better high schools. For those students 
remaining in college these correlations 
were both negative. Again, for the 
withdrawing students the correlations 
between level of aspiration and par- 
ental attitudes, and level of aspiration 
and percent going to college were 
negative, while for the remaining stu- 
dents these values were low and posi- 
tive. Although high parental press for 
the dropout group was accompanied 
by high educational values and high 
high school average, these correlations 
were low and negative for the remain- 
ing students. There was a high nega- 
tive correlation between parental at- 
titude and press for the remaining 
students, while for the dropouts this 
value was zero. 
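
The comparison of paired correlation values across the two groups can be illustrated with a test for the difference between two independent correlations. The report does not say which test was applied; the sketch below assumes Fisher's r-to-z transformation, and the function name and the two correlation values are hypothetical. Only the group sizes (39 dropouts, 261 remaining students) are taken from the text.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Return (z, two-sided p) for the difference between two
    correlations obtained from independent samples."""
    z1 = math.atanh(r1)  # Fisher r-to-z transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided probability under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Group sizes come from the text; the r values are hypothetical
# placeholders and are NOT entries from Table 2.
z, p = fisher_z_difference(0.30, 39, -0.10, 261)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With groups as unequal as these, the standard error is dominated by the smaller (dropout) group, so only fairly large differences between correlations would reach significance under this assumed test.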


Discussion 


It is apparent from the present 
data that students arrive at college 
with certain rather well-formed per- 
ceptions and cognitions relating to 
their college attrition behavior, and 
that these expectancies, as hypothe- 
sized, have selected precollege and 
postcollege admission correlates. In 
addition, these expectancies show 
considerable individual differences. 

At a general level, the student in 
the post-high school—precollege in- 
terval rather simply perceives the 
sources of college dropout as falling 
in a three-dimensional space whose 
defining vectors were labeled (a) 
academic and work skills and their 
utilization, (b) motivation, and (c) 
adjustment or coping behaviors. In 
describing their own behavior, how- 
ever, few students were willing to 
identify themselves with the first 
vector, and probably with good cause, 
since most colleges and universities 
are unwilling to accept students of 
low scholastic ability. 

At the same level it was indicated 
that the strength and direction of 
expectancies involving dropout and 
its antecedents were related to level 
of aspiration, fear of failure, and 
parental attitude. Without getting in- 
volved in questions of cause and ef- 
fect, these correlations suggest that 
the student maintains a balance 
(Heider, 1958) or congruence (Fes- 
tinger, 1957) among the elements of 
his cognitive and motivational do- 
mains. As such, attrition expectancies 
can, in part, be viewed as a cognitive 
matching between salient motiva- 
tional states, for example, fear of fail- 
ure, etc., and relevant situational out- 
comes, such as academic performance. 

The role of parental figures both 
in the development of educational 
expectancies and in student aca- 
demic achievement (behavior) should 
not be underestimated. Clearly, the 
environment established by the par- 
ents and the student’s involvement 
with this environment affect the 
student’s perceptions of and charac- 
teristic modes of dealing with extra- 
family environments. Higher high 
school and college achievement, de- 
spite an absence of correlation with 
ability, attests to the facilitating 
properties of parental attitudes. 

Quite unexpected was the lack of 
relationship between attrition expec- 
tancies and scholastic ability and 
previous educational experiences, for 
example, high school average and 
high school quality. What did emerge, 
however, was a pattern of correla- 
tions among the ability, personality, 
and performance variables which was 
suggestive of a reciprocal reinforce- 
ment by the parental and school en- 
vironments. Students’ educational 
values are apparently mutually rein- 
forced and shaped by the joint efforts 
of parents and school, a not so sur- 
prising notion which offers the in- 
teresting research possibility of study- 
ing discrepancies between these two 
behavioral domains. 

Turning to differences between 
students reporting high and low 
probabilities of college dropout, it was 
indicated that the probable group 
obtained lower level of aspiration 
and educational values mean scores. 
These students are apparently less 
committed to education and obtaining 
a degree than is the nonprobable 
dropout group. More interesting, 
however, are those correlations de- 
scribed previously which are sig- 
nificantly different for the two 
groups. This pattern of correlations 
suggests that the probable dropout 
group is much more concerned 
about satisfying the expectancies of 
their parents and that failure to do 
so might be anxiety- and guilt-pro- 
ducing. For those members of this 
group who view their parents favor- 
ably, that is, high parental attitudes 
and low press, the conflict arises in 
relation to their lack of commitment 
to educational pursuits. The motiva- 
tional properties associated with 
this conflict state apparently lead to 
an initially adequate academic per- 
formance and perhaps even over- 
achievement. It is questionable 
whether resolution of this conflict 
would continue to have this effect 
upon performance; in fact, the mean 
third term cumulative GPA of the 
remaining students in this group was 
considerably lower than the 
corresponding nonprobable dropout 
group mean. Those students in the 
probable dropout group who report 
low parental attitudes and high 
parental press are among the earliest 
poor achievers and dropouts. 

In terms of the prediction of col- 
lege dropout, there is a significant 
relationship between a student’s ex- 
pectancy of, and his actual attrition 
behavior. As to magnitude, fully 
three times as many students in the 
probable dropout group withdrew 
through three terms as did students 
in the nonprobable group. If you 
want to know whether a student is a 
potential college dropout, a good 
starting place is simply to ask him. 
The reasons why he says “yes” are 
indeed more complex. 

The final aspect of the present 
study was a comparison of the drop- 
out and remaining students after 
three terms of work. As frequently 
recorded, the dropout demonstrates 
both lower ability and poorer high 
school performance. In addition, the 
dropout group had a lower mean 
first term GPA, this value being below 
the minimum required for graduation 
from the university. As with the 
probable-nonprobable dropout dimen- 
sion, level of aspiration was signifi- 
cantly related to withdrawal. Parental 
attitudes continued to be related to 
performance criteria, being positively 
correlated with GPA and persistence. 
A higher fear of failure mean score 
was also observed for the dropout 
group. These data complement those 
of Pervin and Rubin (1965) relating 
to discrepancies between self and en- 
vironment and subsequent dropout 
behavior. 

Together with the pattern of sig- 
nificantly different correlations ob- 
tained here, the data from these 
studies suggest that students dropping 
out of college have difficulty in re- 
solving conflicts involving commit- 
ment to educational behaviors, drives 
toward parental and career satisfac- 
tion, and discrepancies between cog- 
nitions concerning self and environ- 
mental feedback. That these students 
are aware of the existence of these 
conflicts and their possible outcomes 
—although they may be unaware of 
the dynamics involved—is indicated 
by the dropout students’ precollege 
expectancies. 

Further studies in this series will 
examine changes in precollege expect- 
ancies as a result of environmental 
interaction, and their relation to long- 
range criteria related to educational 
and career success. 

LEARNING OF LOGICAL CONNECTIVES BY 
ADOLESCENTS WITH SINGLE AND 
MULTIPLE INSTANCES¹ 

Adolescents classified with correction a set either of single or of multi- 
ple instances according to 3 logical concepts: conjunction, exclusive 
disjunction, and conjunction with negated elements. Responses were 
symbolic expressions composed of alphabet symbols standing for 
elements and other symbols standing for logical connectives. On a 
subsequent noncorrected criterial task new instances of these concepts 
were presented as well as instances of novel problems of logical addi- 
tion and simplification. No quantitative difference was observed in the 
attainment series between Ss classifying the single or the multiple sets. 
Ss who classified multiple instances initially, however, showed greater 
success in the criterion task including performance on the novel prob- 
lem of addition. Also, Ss in the multiple instance condition showed 
patterns of success distinct from those evidenced by Ss dealing with a 
single instance set. It was concluded that training with multiple in- 
stances enhanced rule learning of the logical connectives and reduced 
point-to-point matching of instances with symbols. 


The present study was designed to 
assess adolescents’ attainment of 
combinatorial rules defined by the 
logical connectives “and,” “not,” and 
“or,” and to investigate the degree 
of mastery of these logical expres- 
sions. Following the general pro- 
cedure of Youniss and Furth (1964), 
a concept acquisition and subsequent 
criterial series were constructed in 
which adolescents had to match in- 
stances to symbolic expressions. Since 
elements of the instances were de- 
fined beforehand, subjects (Ss) had 
only to attain the meaning of the 
logical connectives within each sym- 
bolic expression. This procedure fol- 
lows a rule-learning paradigm (cf. 
Haygood & Bourne, 1965) rather 
than an “attribute discovery” pro- 
cedure in which Ss must discover 
relevant elements as well as the com- 
binatorial rules. Two major modifica- 
tions of the previous study were: (a) 
a comparison of attainment with 
single and with multiple instances of 
correctness and (b) an extension of 
criterion trials to include novel types 
of problems not seen during acquisi- 
tion but which could have been 
learned implicitly during the acqui- 
sition series. 

The first major analysis in the 
present study compared performance 
between Ss classifying each concept 
according to one particular instance 
with performance in which each con- 
cept was classified according to at 
least two distinct instances. In the 
condition with the single concept in- 
stance, for example, only one instance 
showing two elements was ever cor- 
rect for conjunction. With multiple 
correct instances, more than one in- 
stance, each showing a different pair 
of elements, was correct for conjunc- 
tion. To illustrate, in the single-in- 
stance condition, the one correct in- 
stance for conjunction showed the 
colors red and blue together, while in 
the multiple-instance condition the 
instance showing red and blue and 
the instance showing a man and a 
house together were both correct 
for conjunction. 

This comparison of type of instance 
presentation was suggested by the 
observation that college students 
demonstrated greater concept-gener- 
alization after training with varied 
instances than after training with 
uniform instances (Turnure & Wal- 
lach, 1962). Since in the present 
study the number of exposures for 
matching instances with concepts was 
held constant, presentation of varied 
instances necessarily entailed a re- 
duction in the number of trials in 
which any particular instance was 
matched with a concept. On the other 
hand, showing one particular correct 
instance repeatedly might result in 
more stable acquisition. Thus, it was 
not obvious that results similar to 
Turnure and Wallach’s would be ob- 
tained with adolescents. 

The second modification of our 
previous study was designed to de- 
termine whether Ss implicitly learned 
the logical rules for addition and 
simplification during acquisition. 
This question could be answered by 
comparing performance with in- 
stances essentially similar to those 
presented in acquisition against per- 
formance with novel instances show- 
ing fewer elements (addition) or 
more elements (simplification) than 
were necessary to exemplify correctly 
a concept. 

This comparison was suggested by 
results with adolescents (Furth & 
Youniss, 1965) showing that ease of 
classification of some logical rules is 
contingent upon the physical compo- 
sition of instances. For example, it 
was found that for the exclusive dis- 
junctive rule, “either X or Y,” in- 
stances showing a single element were 
easiest to classify, instances showing 
one relevant and one irrelevant ele- 
ment were next easiest, and instances 
comprised of two irrelevant elements 
(denial) were hardest to classify. 
This result was consistent among 
five independent groups of Ss and 
suggests that adolescents do not nec- 
essarily learn logical rules apart from 
their concrete application. In the 
present design, evidence for rule 
learning would be obtained if Ss 
showed success with the problems of 
addition and simplification that pre- 
sented novel instances but followed 
the same rules for connectives as 
were learned during acquisition. 


METHOD 
General Design 


An acquisition series of 48 corrected trials 
presented 16 instances each of conjunctive 
(C), exclusive disjunctive (D), and conjunc- 
tion with negation (N) concepts. The Ss 
were exposed either to unitary instances, 
called single-instance condition or to two 
kinds of instances for each concept, called 
multiple-instance condition. The single-in- 
stance condition presented to approximately 
one-half the Ss instances comprised of two 
colors and to the other one-half instances 
consisting of two familiar figures. The con- 
dition of multiple instances presented alter- 
nately color and figure instances. Each in- 
stance was to be matched to one of three 
symbolic expressions, X · Y, X / Y, and 
X̄ · Ȳ, which stood for C, D, and N concepts, re- 
spectively. The Ss were told which alphabet- 
symbols stood for which elements on the 
instance cards and that they were to learn 
the meaning of the symbolic expressions 
which stood for the connectives. 
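
The three concepts can be stated compactly as rules over the presence or absence of the two named elements. The following is a minimal sketch, not part of the experimental materials; the function name and the set representation of an instance card are assumptions made for illustration, while the example cards are color pairs of the kind described for the single-instance condition.

```python
def classify(instance, x, y):
    """Return 'C', 'D', or 'N' for an instance card, given the two
    elements x and y that the alphabet symbols stand for."""
    has_x = x in instance
    has_y = y in instance
    if has_x and has_y:
        return "C"  # conjunction: both named elements are present
    if has_x != has_y:
        return "D"  # exclusive disjunction: exactly one is present
    return "N"      # conjunction with negation: neither is present


# Color cards with R = red and B = blue.
cards = [
    {"red", "blue"},
    {"red", "yellow"},
    {"blue", "green"},
    {"yellow", "orange"},
]
labels = [classify(card, "red", "blue") for card in cards]
print(labels)  # ['C', 'D', 'D', 'N']
```

Because the rule refers only to the two named elements, the same function classifies the figure cards (man, house) without change, which is the sense in which a learned connective should transfer to new instances.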

Following the acquisition series, 20 non- 
corrected criterial trials were given: four 
trials each which instanced C, D, and N 
concepts, and eight more trials which in- 
stanced the disjunctive class in new ways. 
All Ss performed the criterial series with in- 
stances of geometric forms which were not 
shown in the acquisition series. Again Ss 
were told the meaning of the alphabet- 
symbols, but since correction was not given 
a situation was provided to test whether Ss 
had learned and could apply the logical con- 
nectives to which they had previously been 
exposed. 


Procedure 


The Ss were tested in their classrooms in 
groups of approximately 40 with the class- 
room teachers present serving as monitors. 
Each S was given a two-page handout. Sheet 
1, for the attainment series, contained 48 
trial markers; after each trial the three sym- 
bolic expressions were given. As these were 
being passed out, the experimenter (E) put 
on the blackboard the same symbolic expres- 
sions which appeared on the sheet as well as 
the elements of the instance cards which 
the alphabet-symbols represented. 

Single-instance condition with colors. In- 
stances used in this task were two color 
patches separated left-from-right on 5 × 8 
inch cards. Sixteen of these cards contained 
the colors red and blue; these cards in- 
stanced the C concept: “red and blue col- 
ors.” Sixteen other cards contained red and 
yellow, red and green, blue and yellow, or 
blue and green colors. These cards were cor- 
rect instances of the D concept: “either red 
or blue, but not both of these colors.” The 
remaining 16 cards were colored yellow and 
green, yellow and orange, green and yellow, 
or green and orange; these were instances 
of the N concept: “neither red nor blue.” 

On the answer sheet next to each trial 
number appeared the response alternatives 
R · B, R / B, and R̄ · B̄, the symbolic ex- 
pressions for C, D, and N, respectively. Left, 
middle, and right position of each alterna- 
tive was counterbalanced over the attain- 
ment series in this and in each of the other 
conditions described below. 

The Ss were told beforehand that “R” 
stood for red and that “B” stood for blue. 
Further, they were told to learn which 
“answer” (i.e., symbolic expression) was the 
correct one for each card they would be 
shown. They were informed also that E 
would provide the correct answer for each 
card after everyone had circled his choice 
and raised his pencil. Correction was carried 
out by E's pointing to the correct expression 
of the three placed on the blackboard. 

In this and all other conditions instances 
were presented singly in a restricted random 
order so that within each six-trial block two 
instances of each concept were presented. A 
random order following this restriction was 
designed for 48 trials. 
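
A restricted random order of this kind can be produced by shuffling within blocks. The sketch below is illustrative only: the function name, its parameters, and the seed are assumptions, and it is not the order actually used in the experiment. It assumes eight successive six-trial blocks, each holding two instances of each concept.

```python
import random


def restricted_order(n_blocks=8, concepts=("C", "D", "N"), per_block=2, seed=None):
    """Build a trial order in which every block contains each concept
    exactly `per_block` times, in random order within the block."""
    rng = random.Random(seed)
    order = []
    for _block in range(n_blocks):
        block = [concept for concept in concepts for _rep in range(per_block)]
        rng.shuffle(block)  # randomize only within the block
        order.extend(block)
    return order


order = restricted_order(seed=1967)
print(len(order))  # 48
print(order[:6])   # first block: two each of C, D, and N in some order
```

Blocking guarantees that each concept appears 16 times over the 48 trials and that no concept can be absent for long stretches, which matters when speed of attainment is later used as a grouping variable.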

Single instance condition with figures. 
This condition was similar to that described 
above except that figures replaced colors 
with an appropriate change in alphabet sym- 
bols. The C concept in this condition was 
“man and house,” which was expressed by 
M · H. “Either man or house, but not both” 
was the D concept, which was expressed 
M / H. Finally, the N concept of “neither 
man nor house,” expressed M̄ · H̄, com- 
pleted this condition of 48 trials with 16 
trials per concept. 

Multiple instance condition. This condi- 
tion, also consisting of 48 trials, presented 
both color and figure instances and required 
both types of symbolic responses. The 16 
trials illustrating each concept were divided 
equally between color and figure instances. 
Odd-numbered trials presented color in- 
stances and even-numbered trials presented 
figure instances. The same random order 
prevailed as was used above. In all other 
respects this condition resembled the single- 
instance conditions. 

Immediately following Trial 48 of the 
attainment series Ss were told to turn to 
Sheet 2 of their handout, beginning the 
criterial series. All Ss were then exposed to 
20 noncorrected trials. In this series geo- 
metric forms were used as instances and 
appropriate alphabet symbols along with 
the connectives used in attainment were 
responses. 

Trials 1 to 12 of this series showed in 
random order four instances each of C, D, and N 
concepts. For the C concept, the instance 
card showed a circle and a square; the 
appropriate expression was C · S. The cor- 
rect instances for the D concept presented 
a circle and triangle or a square and triangle, 
which was expressed as C / S. The four N 
instances showed a triangle and ellipse, ex- 
pressed as C̄ · S̄. The Ss were told before- 
hand that C stood for circle and S stood for 
square. 

The next four trials, called affirmation 
with simplification (AS), presented also dual- 
element instances comprised of geometric 
forms. They differed from all other trials 
in this experiment in that they instanced 
affirmation of a unitary class expressed by 
a single alphabet symbol. Of the two ele- 
ments comprising each instance, one in- 
stanced the class while the other was ir- 
relevant. To illustrate, Trial 16 showed the 
instance square and triangle with the re- 
sponse alternatives C, S, and S · T. The 
correct response here was S (square), a 
unitary class instanced by the presence of a 
square and, incidentally, the presence of the 
irrelevant triangle. 

The last four trials of the noncorrected 
criterial series presented instances of ex- 
clusive disjunctive concepts which differed 
from D trials described above in that only 
single element instances were shown. We 
call these trials cases of disjunction with 
addition (DA). On each of these trials two 
D-type expressions were alternatives. To 
illustrate, the last trial of the series showed 
as the instance a circle with the response 
choices C̄, S / T, and C / S, the last being 
the correct response. Thus, in distinction 
from D and from AS trials, only one element 
comprised an instance, but, like D and AS 
problems, only one element, in fact, was 
necessary to satisfy the symbolic expression. 
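
The logic of the AS and DA problems can be made explicit by evaluating an expression against the set of elements shown on a card. The sketch below is an illustration under stated assumptions: the tuple encoding of expressions and the function name are hypothetical, and the first DA alternative is read as the negated symbol for circle, a reading suggested by the later error analysis rather than one the text states outright.

```python
def satisfies(expr, instance):
    """Return True if the set of elements `instance` satisfies `expr`.

    Expressions are tuples (hypothetical encoding):
      ("affirm", a), ("not", a), ("and", a, b),
      ("xor", a, b) for exclusive disjunction,
      ("neither", a, b) for conjunction with negated elements.
    """
    kind = expr[0]
    if kind == "affirm":
        return expr[1] in instance
    if kind == "not":
        return expr[1] not in instance
    if kind == "and":
        return expr[1] in instance and expr[2] in instance
    if kind == "xor":
        return (expr[1] in instance) != (expr[2] in instance)
    if kind == "neither":
        return expr[1] not in instance and expr[2] not in instance
    raise ValueError(f"unknown expression type: {kind}")


# AS: a card showing square and triangle still affirms the class "square";
# the triangle is simply irrelevant (simplification).
as_holds = satisfies(("affirm", "square"), {"square", "triangle"})

# DA: a card showing only a circle, with three response choices.
da_choices = {
    "not C": ("not", "circle"),
    "S / T": ("xor", "square", "triangle"),
    "C / S": ("xor", "circle", "square"),
}
da_correct = [name for name, expr in da_choices.items()
              if satisfies(expr, {"circle"})]
print(as_holds)    # True
print(da_correct)  # ['C / S']
```

Neither problem can be solved by counting elements and symbols: the AS card shows more elements than the correct expression names, and the DA card shows fewer.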


Sampling and Subjects 


The adolescents selected for this experi- 
ment were taken from an initial sample of 
454 children enrolled in Grades 6 and 7 of a 
parochial school in suburban Philadelphia, 
Pennsylvania. The total sample was approxi- 
mately halved when Ss were divided into 
those failing and those succeeding to an 
arbitrary criterion in the attainment series. 
This criterion required three successive cor- 
rect responses on all three concepts consecu- 
tively; it could be achieved with from 9 to 
12 consecutive correct responses depending 
on the point at which it was attained in the 
series. 

Approximately equal percentages of 
Grade 6 and Grade 7 Ss succeeded, 46% and 
50%, respectively; hence it was considered 
appropriate to combine grade levels into a 
single adolescent group. 

Next an attempt was made to obtain 
equal size samples from conditions of single 
and multiple instances. Since 46% of the 
single-condition Ss and 50% of the multiple- 
condition Ss succeeded to criterion, it was 
concluded that neither of these conditions 
contributed disproportionately to failure 
rate; hence scores from Ss in these respective 
conditions were considered comparable. 

An additional sampling restriction was 
designed to control for speed of initial at- 
tainment. Thus, of all Ss achieving cri- 
terion, equal numbers were selected from 
those succeeding in approximately the first, 
second, and third quarters of the attain- 
ment series. 

The resulting present sample had a total 
of 96 Ss: 48 of those who had succeeded in 
single and 48 who had succeeded in multiple 
conditions. Further, the 48 Ss in each condi- 
tion comprised 16 Ss who had reached cri- 
terion between Trials 1 and 12, 16 Ss who 
attained criterion between Trials 13 and 
22, and 16 Ss who succeeded between Trials 
23 and 38, the last trial on which an error- 
less criterion run could begin. These last 
three groups of Ss are hereafter referred to 
as fast, intermediate, and slow learners, 
respectively. Finally, within single-instance 
conditions one-half of the fast, slow, and 
intermediate groups was composed of Ss 
trained with color instances while the other 
one-half was exposed to figure instances. 


RESULTS 


Attainment series. Table 1 reports 
trials-to-criterion as a function of 
type of instance and speed of learn- 
ing and errors-to-criterion as a func- 
tion of these conditions and of con- 
cepts. It can be seen that selection 
according to speed of attainment re- 
sulted in three distinct groupings of 
Ss and that trials-to-criterion scores, 
in each case, are comparable for sin- 
gle- and multiple-instance conditions. 

TABLE 1 
MEANS OF TRIALS- AND ERRORS-TO-CRITERION IN ATTAINMENT AS A FUNCTION OF 
EXPERIMENTAL CONDITIONS AND CONCEPTS 
[Table body not recoverable; rows were speed of attainment: Fast, Intermediate, Slow.] 

Error scores were used to evaluate 
relative difficulty among concepts C, 
D, and N. It is seen in Table 1 that 
in five of the six independent groups 
(3 speeds of learning × 2 types of 
instance) fewest errors were made 
with the C concept, while in four of 
the six groups D proved to be asso- 
ciated with the most errors. In order 
to take instance conditions into ac- 
count, speed of learning was col- 
lapsed and two Friedman two-way 
analyses were made, one for single 
and one for multiple conditions. In 
both cases reliable differences among 
concepts were observed: the single 
condition resulted in χ² = 5.45, df 
= 2, p < .05, and the multiple 
condition yielded χ² = 24.48, df = 
2, p < .001. It can be seen in Table 
1 that these results reflect differential 
ordering among concepts due to type 
of instance condition. For single-in- 
stance Ss, C was easiest and D the 
most difficult, while for multiple-in- 
stance Ss C was easiest but N was 
the most difficult concept. 

To assess further these dissimilar 
orders of difficulty, Mann-Whitney 
U tests were computed between single 
and multiple conditions for each con- 
cept. None of these comparisons in- 
dicated a significant difference due to 
mode of condition. Thus, it was con- 
cluded that prior to criterion trials 
no quantitative differences between 
single and multiple conditions were 
found except that these conditions re- 
sulted in differential orders of diffi- 
culty among concepts. 

Criterion series. Table 2 reports er- 
ror scores for the five concepts tested 
in this series as a function of the 
experimental conditions. An initial 
analysis was performed to determine 
overall effects of instance conditions 
and speed of initial learning on cri- 
terial performance. Errors were 
summed across concepts, yielding a 
possible range of scores from 0 to 20, 
and a 3 × 2 analysis of variance was 
computed. Single versus multiple con- 
ditions proved to have a reliable ef- 
fect with F = 12.08, df = 1, 90, p 
< .001; and speed of acquisition also 
was a reliable source of variance with 
F = 10.79, df = 2, 90, p < .001; and 
their interaction resulted in a non- 
reliable F < 1. Thus, facility with 
which criterion was reached in the 
attainment series continued to show 
an effect during criterial trials, even 
though all Ss were exposed to 48 cor- 
rected trials. On the other hand, 
while instance conditions failed to 
produce significant effects in attain- 
ment trials, a clear difference between 
them is observed in the criterial series 
with Ss exposed to multiple instances 
making consistently more accurate 
identification of new instances than 
Ss trained in the single condition. 

The second analysis of criterial 
performance measured differential 
difficulty among concepts. It is seen 
in Table 2 that a clear consistency 
is obtained across the six independent 
groups with fewest to most mean 
errors following the order C, D, N, 
DA, AS, with two slight exceptions, 
these two cases showing equiv- 
alent errors in D and N. 

TABLE 2 
MEAN ERRORS PER CONCEPT IN THE CRITERION SERIES AS A FUNCTION OF CONDITIONS 
AND CONCEPTS 

Speed of                Single instance                       Multiple instance 
attainment      C     D     N    AS    DA  Collapsed     C     D     N    AS    DA  Collapsed 
Fast           .5    .8   1.4   2.5   1.6     6.8       .1    .5    .5   1.8   1.1     4.0 
Intermediate   .6   1.6   2.1   2.4   2.2     8.9       .2    .8    .9   2.1   1.0     5.0 
Slow          1.1   2.0   2.0   3.0   2.6    10.7       .8   1.6   1.7   2.9   2.0     9.0 

TABLE 3 
PERCENTAGES OF SUBJECTS SUCCEEDING ON EACH CONCEPT 

Speed of            Single instance           Multiple instance 
attainment       C    D    N   AS   DA      C    D    N   AS   DA 
Fast            88   75   69   25   38     94   88    ?    ?    ? 
Intermediate    81   25   31   25   19     94   63    ?    ?    ? 
Slow            69   25   38   13   13     75   44   44   19   25 
Collapsed       79   42   46   21   23     88   65   65   33   50 

Note. ? = entry illegible. 

Because of the stability of this 
ordering a further attempt was made 
to measure interconcept differences. 
With possible errors per concept 
ranging only from 0 to 4, an arbi- 
trary score of success was employed. 
Success with a concept was cred- 
ited when an S made either one 
or no errors in the four trials in 
which that concept was presented; 
two or more errors then was con- 
sidered failure. Table 3 reports per- 
centages of success as a function of 
conditions using this arbitrary cri- 
terion of success. 

With speed of attainment collapsed, 
comparisons of numbers of Ss suc- 
ceeding were made between single and 
multiple conditions for each concept. 
Three of the five comparisons proved 
to be reliable although with all five 
concepts greater numbers of Ss suc- 
ceeded with multiple-instance training 
than with single instance. For C a 
reliable χ² = 4.66, df = 1, p < .05, was 
obtained; for D a reliable χ² = 4.18, 
df = 1, p < .05, was observed; and for 
DA a χ² = 6.47, df = 1, p < .02, was 
obtained; the result for N approached 
significance with χ² = 2.70, df = 1, p < [value illegible]. 

In order to answer the further 
question whether adolescents succeed 
by matching elements of instances 
with elements of symbolic expressions 
or whether they succeed by applying 
rules, performance was measured ac- 
cording to success on all three of the 
original concepts. While C and N 
allowed for point-to-point matching, 
D did not, since it was always in- 
stanced by the presence of one relevant 
and one irrelevant element. That is to 
say, one could succeed on C or N by 
merely matching each symbol in an ex- 
pression with the two elements of an 
instance without taking into account 
the conjunctive connection. For D, on 
the other hand, the connective had to 
be applied since one of the two symbols 
had no counterpart in either of the 
elements in an instance. Thus, if an S 
succeeded with all three concepts then 
it would follow that matching alone 
would not account for his performance. 

Among Ss trained in the single con- 
dition 35% succeeded on all three 
concepts while among those in the 
multiple condition 56% succeeded on 
C, D, and N. Thus it can be con- 
cluded that more Ss in the multiple 
than single groups were not merely 
matching (χ² = 4.20, df = 1, p < 
.05) and further that approximately 
one-half the Ss succeeded probably 
by use of rule application. 

A further analysis of criterial per- 
formance compared relative difficulty 
among D, DA, and AS trials. Sign 
tests were used to determine whether 
one concept compared to another re- 
sulted in more, less, or the same num- 
ber of errors. Absence of a difference 
in errors between D and each of the 
other concepts could be taken as evi- 
dence that Ss had learned implicitly 
during D-trials the exclusive dis- 
junctive rule. Scores from single and 
multiple-instance groups were evalu- 
ated separately; thus, six Sign tests 
were made. Only one of these re- 
sulted in a nonsignificant difference, 
that being for the comparison be- 
tween D and DA concepts for multi- 
ple-instance Ss (p > .10), indicating 
equivalent success on D and DA for these 
Ss. 

TABLE 4 
PERCENTAGES OF ERRORS TO VARIOUS ALTERNATIVES IN FOUR TRIALS WITH EACH 
CONCEPT 

Concept   Alternative    Single   Multiple 
C         X / Y            40        59 
          X̄ · Ȳ            60        41 
D         X · Y            40        55 
          X̄ · Ȳ            60        45 
N         X · Y            20        12 
          X / Y            80        88 
AS        Y                10        14 
          X · Y ᵃ          52        52 
          X . Y (?)        39        34 
DA        X̄ ᵃ              68        64 
          X / Y ᵃ          11        13 
          X . Y (?)        21        23 

ᵃ An alternative on only two of the four trials. 
(?) Connective or negation marks illegible. 
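
For readers unfamiliar with the Sign test used in these comparisons, a minimal sketch follows. Each S contributes a plus or a minus according to which of two concepts drew more errors, ties being dropped; the probability is then taken from the binomial distribution with p = .5. The function name and the example counts are hypothetical and are not data from this study.

```python
from math import comb


def sign_test(n_plus, n_minus):
    """Two-sided Sign test probability for n_plus positive and
    n_minus negative differences (ties already discarded)."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # Probability of a split at least this uneven in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example: 12 Ss made more errors on one concept,
# 10 Ss made more on the other, and the remaining Ss were tied.
p_value = sign_test(12, 10)
print(f"two-sided p = {p_value:.3f}")
```

A nearly even split of this kind yields a large probability, the pattern reported for the D versus DA comparison among multiple-instance Ss.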

The final aspect of criterion per- 
formance to be considered was an 
analysis of kinds of errors or mis- 
identified instances made per con- 
cept. With C and D concepts, errors 
were distributed equally between the 
two alternatives as can be seen in 
Table 4. For N, AS, and DA, how- 
ever, disproportionate distributions 
among response alternatives can be 
seen. For N the dominant error was 
to respond with D but only rarely 
with C. 

With AS the error distributions in- 
dicated a disproportionate response 
to the X · Y alternative, which was 
presented on two trials and accounted 
for 52% of all errors made in the four 
trials on which this concept was pre- 
sented. Recall that these AS in- 
stances showed dual elements which 
in fact instanced more than was nec- 
essary for single affirmation. Only 
10% and 14% of the errors in single 
and multiple conditions, respectively, 
were made with a response which 
presented only one symbol. These two 
results taken together indicate that 
Ss who did not understand affirma- 
tion were most likely attempting to 
match a dual-element instance with 
a dual-symbol expression. 

These data agree with error analy- 
ses of DA which showed a majority 
of errors with dual-element symbol 
expressions rather than with unitary 
symbols. In other words, of the DA er- 
rors made, the dominant tendency was 
to match a unitary element with a 
single symbol even though this sym- 
bol was the negation of the element 
shown in the instance (e.g., circle as
the instance with C). 


Discussion 


This investigation was undertaken 
to study characteristics of adoles- 
cents’ concept attainment under two 
modes of instance presentation. The 
multiplicity of instances to which Ss 
were exposed in the attainment series 
proved to be an important factor de- 
termining success in identification of 
new instances in the criterial series, 
as multiple-instance exposure led to 
greater success on four of the five 
criterial concepts than single-instance 
exposure. A large proportion of this 
success might be attributed to the
learning of rules for the logical
connectives, but some success seems due
to a matching strategy in which Ss 
compared elements of instances with 
symbols of logical expressions. 

Evidence for rule learning of con- 
nectives was inferred from the pat- 
tern of success for Ss showing cor- 
rect identification of C, D, and N, as 
well as from their equivalent success 
on D and DA trials. Across all S
groups a consistent ordering of concepts
from least to most difficult was
C < D = N < DA < AS. For approximately
one-half of the Ss (56% and 35% of the
multiple- and single-instance groups,
respectively) this
ordering was modified with C, D, and 
N being equally easy. This pattern 
of success can be seen as the result of 
Ss having learned relational rules for 
the three connectives while it cannot 
be attributed to the sublogical proc- 
ess of matching. This is to say, with 
attributes and alphabet symbols de- 
fined, an S could have succeeded on 
C and N trials by matching
point-for-point elements of instances with
symbols of expressions. With D trials, in
distinction, each instance contained 
one element which had no symbolic 
equivalent in the expression; like- 
wise, each expression had one symbol 
for which there was no element pres- 
ent. Since C, D, and N comprise a 
logical set while they are nonequiva- 
lent from the matching standpoint, 
success with all three operations
would seem to be indicative of rule
learning. 

A second line of evidence for rule 
learning is seen in performance on 
DA trials relative to D trials by Ss 
given multiple-instance training. Recall
that DA trials were unique in
this experiment in their presentation
of instances containing only one element.
As with D, a matching strategy
would be inadequate for DA. In spite
of this drawback and the novelty of
procedure no more errors were made 
on DA than on D trials by Ss per- 
forming with multiple instances. Ap- 
parently these Ss had learned the 
rule for the D connective and were 
as able to apply it to dual-element 
instances as to novel unitary in- 
stances. 

Further evidence for rule learning 
might have been adduced had success 
on AS trials not been so minimal. 
The error analysis for AS trials in- 
dicated a strong tendency to attempt 
a match, even to the extent that a 
denial of a present element was re- 
sponded to if the expression contain- 
ing the denial had two symbols. This 
preponderance of attempts to match 
need not be interpreted as incongru- 
ous with rule learning, however, since 
success on DA was probably effected 
through the rule for the D-connective 
while in AS trials no connective was 
given. 

These data concur with previous
observations on concept attainment
reported for college students.
Enhanced rule-learning following
multiple-instance training agrees with
data presented by Turnure and Wal- 
lach (1965) who found greater gen- 
eralization to new instances after 
varied than after unitary instance 
training. Our data agree with their 
interpretation, that the constant crit- 
ical feature in a set of instances, in 
our case the connective, is likely to 
be differentiated from its surround- 
ings more than a stimulus occurring 
repeatedly in the same context. In 
our procedure varied instances made 
explicit the relational features of the 
connectives while behavior relatively 
more tied to explicit content was pro- 
duced with single instancing. 

Finally, the results resemble those 
reported by Haygood and Bourne 
(1965) for college students. Their Ss
were submitted to a series of instance
identification tasks with attributes
defined beforehand. As the series pro- 
gressed, rule-learning as measured by 
intertask improvement became evi- 
dent. Haygood and Bourne employed 
concepts equivalent to C, D, and N 
of the present study and observed 
"maximum" rule learning by the sec- 
ond or third problem. Their study 
varied instances in successive tasks 
while ours did so in one task, but the 
performance of our adolescents is 
similar to that of their college students. The
major difference between results of 
the two studies would, however, show 
that not all our adolescents are suc- 
cessful rule learners while from Hay- 
good and Bourne’s data (cf. Table 3,
p. 183) it appears that most if not
all college students succeeded by the 
third series.

BEHAVIOR

A 4½-yr.-old boy with excessively short span of attention was helped
to acquire more extended attending behavior through the systematic
programming of contingencies for adult social reinforcement. When
the child remained with a single activity for 1 continuous minute,
teachers immediately gave attention and approval for as long as he
remained with that activity. Teachers withheld their attention
consequent upon all other behavior. Within 7 days the number of activity
changes decreased markedly. Reversal of these procedures reinstated
the hyperactive behavior. When original reinforcement contingencies
were reintroduced, there was again a marked decrease in number of
activity changes. The study gives evidence that adults can help a child
to increase his attending behavior, a crucial aspect of learning.


There exists now a series of ex- 
perimental field studies applying re- 
inforcement principles to problem 
behaviors of preschool children. These 
studies have dealt with crying (Hart, 
Allen, Buell, Harris, & Wolf, 1964), 
regressive crawling (Harris, Johnston,
Kelley, & Wolf, 1964), isolate
play (Allen, Hart, Buell, Harris, &
Wolf, 1964), passivity (Johnston,
Kelley, Harris, & Wolf, 1966),
noncooperative behaviors (Hart, Reynolds,
Brawley, Harris, & Baer, 1966),
self-mutilative scratching (Allen &
Harris, 1966), autistic behavior
(Brawley, Harris, Peterson, Allen,
& Fleming, 1966; Wolf, Risley, &
Mees, 1964), and classroom
disruptiveness (Allen, Reynolds, Harris, &
Baer, 1966). In each instance, the
behavior under examination was highly
responsive to adult social reinforcement.
The present study was conducted
to ascertain whether similar
social reinforcement procedures could
alter the hyperactivity of a 4-year-old
boy who tended to flit from activity
to activity.


Attending behavior, commonly re- 
ferred to as “attention span,” has 
long been recognized as a crucial and 
desirable alternative to hyperactivity. 
What has not always been clear is 
the extent to which attending is a 
behavior which teachers can help a 
child to develop, although Patterson 
(Patterson, Jones, Whittier, & Wright, 
1965) has done work in this area with 
older children. Thus it is of interest to 
determine if systematic social rein- 
forcement can increase the duration of 
a young child’s attending to an activ- 
ity, and also to analyze the successive 
steps a teacher might take in helping 
a child to maintain his attention to an 
activity for increasingly long periods. 

One of the ultimate objectives of 
preschool education is, of course, to 
develop a child’s skills in using ma- 
terials constructively and creatively. 
An essential first step toward this
objective sometimes must be to
increase the time the child spends
engaging in each activity. Fortunately,
duration of attention can be defined,
observed, and reliably recorded in the
field situation.


METHOD


Subject 


James was one of 16 normal children of 
middle-socioeconomic status who comprised 
the 4-year-old group in the Laboratory Pre- 
school. At the inception of the study, he was 
4 years, 6 months old and had been attend- 
ing school for 3 months. 

James was a vigorous, healthy child with 
a well-developed repertoire of motor, social, 
and intellectual skills. Although he made a 
comfortable adjustment to school during 
the first few weeks, a tendency to move 
constantly from one play activity to another, 
thereby spending little time in any one pur- 
suit, was noted early. Since such behavior is 
common to some young children in a new 
situation, his teachers merely continued 
their friendly efforts to engage him in more 
prolonged and concentrated use of materials.

After 12 weeks James showed no diminu- 
tion in number of activity changes during 
play periods. An observer then was assigned 
to record his behavior, noting his activities 
and the time spent in each. Records kept 
over 5 school mornings showed that al- 
though occasionally James stayed with an 
activity for 1, 2, or 3 minutes, the average 
duration of an activity was less than 1 minute.
The parent reported that the same kind
of “flightiness” had long caused concern at 
home. It was agreed that a study be made 
of ways of helping James to increase his at- 
tending behavior. 


Procedure 


The procedure for increasing the duration
of time spent in any activity was to make
adult social reinforcement contingent solely
on the subject's (S's) emitting attending
behavior for a specified minimum period of
time. Attending behavior was defined as
being engaged in one activity. This included play
activity (a) with a single type of material,
such as blocks or paint; (b) in a single
location, such as in the sandbox or at a table;
or (c) in a single dramatic role, such as
sailor or fireman. Adult social reinforcement
(Bijou & Baer, 1965) was defined as one or
more of the following teacher behaviors:
talking to S while facing him within a
distance of 3 feet, or from a greater distance
using his name; touching S; and giving him
additional materials suitable to his
activity. Withholding or withdrawing social
reinforcement consisted of turning away 
from S; not looking or smiling at him; not 
speaking to him; and directing attention to 
some other child or activity. 

One teacher was assigned major respon- 
sibility for maintaining reinforcement con- 
tingencies. However, since the two other 
teachers might at times also deliver or with- 
hold reinforcement, each had to remain con- 
stantly aware of the conditions in force. 

The design of the study required four suc- 
cessive experimental stages, as delineated by 
Harris (1964). 

Base line. The existing rate, or operant 
level, of activity changes prior to systematic 
application of adult social reinforcement was 
recorded for several play sessions. 

Reinforcement. Social reinforcement was
presented immediately when attending be- 
havior had been emitted for 1 unbroken 
minute. Reinforcement was maintained con- 
tinuously until S left the material or the 
area or verbalized a change in his play role. 
Immediately consequent upon such a shift 
in play activity, social reinforcement ceased 
until 1 minute of attending behavior had 
again been emitted. The procedure was con- 
tinued until attending behaviors had mate- 
rially increased. 
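
The contingency described in the Reinforcement stage can be sketched in code. The sketch below is a hedged illustration only: the function name `reinforcement_schedule`, the representation of a play session as one activity label per 10-second interval, and the use of `None` for intervals with no codable activity are assumptions made for the example, not part of the study's procedure.

```python
def reinforcement_schedule(activities, criterion_intervals=6):
    """Mark the intervals in which adult attention would be given.

    `activities` holds one label per 10-second interval (hypothetical
    coding); `None` means S was not engaged in any activity.
    With the default of 6 intervals, the criterion is 1 unbroken minute.
    An interval is marked True only if S had already completed the
    criterion in the same activity before that interval began.
    """
    schedule = []
    current = None  # activity S is currently engaged in
    run = 0         # consecutive intervals spent in `current`
    for activity in activities:
        if activity is not None and activity == current:
            run += 1
        else:
            # A shift in activity resets the count, so reinforcement
            # stops until the criterion has been met again.
            current = activity
            run = 1 if activity is not None else 0
        schedule.append(run > criterion_intervals)
    return schedule
```

Raising `criterion_intervals` to 12 corresponds to the 2-minute criterion used later in the Reinstatement stage.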

Reversal. Then, to ascertain whether social
reinforcement was in fact the determin-
ing factor in modifying the behavior under 
study, reinforcement was again delivered on 
a noncontingent basis such as had been in 
effect during the base-line period. This re- 
versal of contingencies was carried out long 
enough to yield a clear assessment of the 
effects of the changed conditions. 

Reinstatement. During this period, the 
procedures in effect during the second stage, 
Reinforcement, were reinstituted. After at- 
tending behaviors had again increased in 
duration, the criterion for presenting social 
reinforcement was raised to 2 minutes. 


Recording 


The S's attending behavior and adult social
reinforcement, as previously defined,
were coded and recorded in successive
10-second intervals by an observer using a
stopwatch and a red flashlight with a magnet
attached. The recording system was similar
to that described by Allen et al. (1964).
Each period of attending to one activity
was enclosed in brackets. Since an increase
in attending behavior brought a corresponding
decrease in the number of activity
changes, data on attending behavior were
counted and graphed in terms of the number
of activity changes occurring within succes- 
sive 50-minute time units. In general, but 
not necessarily, two 50-minute periods indi- 
cated 1 day of recording of play time ex- 
clusive of teacher-structured or teacher-di- 
rected activities. 
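
As a rough sketch of how such counts could be derived from interval records, the following assumes one activity label per 10-second interval, so that a 50-minute unit spans 300 intervals. The function name and the data layout are hypothetical; the observers' actual tallying method is not described beyond the text above.

```python
def activity_changes_per_period(activities, intervals_per_period=300):
    """Count activity changes within successive time units.

    `activities` is a list with one label per 10-second interval
    (assumed coding). A change is counted whenever the label differs
    from that of the preceding interval in the same time unit; a shift
    that falls exactly on a unit boundary is not counted here.
    """
    counts = []
    for start in range(0, len(activities), intervals_per_period):
        block = activities[start:start + intervals_per_period]
        changes = sum(1 for prev, cur in zip(block, block[1:]) if cur != prev)
        counts.append(changes)
    return counts
```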

During the two reinforcement stages, the 
observer used a flashlight to inform teachers 
when S reached criterion for social rein- 
forcement. The cue consisted of placing the 
flashlight on top of the metal clip of the 
clipboard as soon as S had emitted 1 minute, 
and later 2 minutes, of attending behavior. 
When the behavior stopped, the observer 
removed the flashlight and placed it under 
the clipboard, where it remained out of sight 
until criterion attending behavior had again 
been emitted. Teachers were instructed to 
maintain awareness of the flashlight position 
and to check it before giving S any social 
reinforcement. 

Periodically throughout the study, ob- 
server reliability on attending behaviors, 
activity changes, and adult social reinforce- 
ment was checked by an independent ob- 
server. Agreement of records ranged between 
97% and 100%. No post checks of attending 
behavior could be made because the study 
was terminated by the close of the school 
year, at which time the family moved to 
another city. 
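
The article does not state how agreement was computed. One common approach, shown here purely as an assumed illustration, is interval-by-interval percentage agreement between the two observers' records; the function name and the record format are hypothetical.

```python
def percent_agreement(record_a, record_b):
    """Percentage of intervals on which two observers recorded the same code.

    Both records are assumed to hold one code per 10-second interval
    and to cover the same stretch of observation.
    """
    if len(record_a) != len(record_b):
        raise ValueError("records must cover the same number of intervals")
    if not record_a:
        raise ValueError("records must not be empty")
    agreements = sum(a == b for a, b in zip(record_a, record_b))
    return 100.0 * agreements / len(record_a)
```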

In addition to the behavior under study, 
some assessment of whether social aspects of 
S's behavior were affected by changes in his 
attending behavior seemed desirable. There- 
fore, S's verbalizations, proximity to, and 
cooperation with other children were de- 
fined, coded, and recorded. The quality of 
the child's social behavior was estimated by 
considering cooperative behavior as high- 
level social behavior and mere proximity 
as low-level social behavior, in contrast to 
isolate behavior, which was considered non- 
social. Interrater reliability on these param- 
eters ranged between 84% and 92%. 


RESULTS 


Base line—Stage 1. The number of 
activity changes that James made in 
each of 21 successive 50-minute pe- 
riods of free-choice play, both indoors 
and out, is shown in Figure 1. The 
fewest activity changes, 33, occurred
during Period 12, with an
average duration of 1 minute 29
seconds per activity. The greatest
number of activity changes occurred
in Period 14, with 82 changes and 
an average duration of 37 seconds 
per activity. The overall average for 
the base-line stage (the operant level 
of the behavior under study) was 
56 activity changes per 50-minute 
period, with an average duration of 
53 seconds per activity. 
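
The relation between the two measures can be illustrated with a short calculation. It assumes, as the article does not state explicitly, that the average duration is simply the length of the period divided by the number of activity changes; this reproduces the reported figures only approximately (for example, 56 changes in 50 minutes gives about 53.6 seconds).

```python
def average_duration_seconds(activity_changes, period_minutes=50):
    """Average time per activity, assuming duration = period / changes."""
    return period_minutes * 60 / activity_changes


def format_min_sec(seconds):
    """Render a duration in whole minutes and seconds (rounded)."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"
```

Under this assumption, `format_min_sec(average_duration_seconds(27))` gives "1 min 51 s", the value reported for an average of 27 changes per period.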

The amount of teacher reinforce- 
ment presented to James on a ran- 
dom, noncontingent basis averaged 
16% of each session. This rate was 
within the normal range in this pre- 
school of amount of teacher attention 
per child. 

Reinforcement—Stage 2. This stage 
comprised seven 50-minute periods, as 
shown in Figure 1. Activity changes 
ranged from a high of 41 in Period 22 
(the first period of experimental pro- 
cedures) to a low of 19 in Period 28 
(the last period of experimental pro- 
cedures). The overall average of ac- 
tivity changes for the seven periods 
was 27, with an average duration of 
1 minute 51 seconds per activity, or 
twice that of the base-line stage. 
Teacher reinforcement during Stage 2 
averaged 38% of each period. 

Reversal—Stage 3. During the four-period
reversal stage (Figure 1, Stage
3), activity changes rose markedly. 
An average of 51 activity changes 
per period occurred, with an average 
duration of 59 seconds per activity. 
Both measures (number of changes 
and average duration) were compara- 
ble to the base-line stage. During the 
reversal teacher attention averaged 
14% of each period. 

Reinstatement—Stage 4-A and B. 
Reinforcement contingencies during 
Stage 4-A, Figure 1, were the same as 
those in effect during Stage 2. Under 
these conditions, the rate of activity 
changes again dropped markedly, 
with a high of 31 and a low of 12 
(Periods 33 and 36, respectively).

Fig. 1. Number of activity changes of S during 50-minute periods throughout study. (Stage 1, rate
of activity change under noncontingent attention. Stage 2, attention contingent on 1 minute
of attending. Stage 3, base-line condition. Stage 4, contingent attending as in Stage 2;
at dotted line, criterion for attending raised to 2 minutes. Arrows indicate days
mother visited.)


The overall average of activity 
changes for the eight periods of Stage 
4-A was 20, with an average duration 
of 2½ minutes per activity. Teacher
reinforcement during Stage 4-A av- 
eraged 31% of each period. 

In Period 41 (Figure 1, Stage 4-B) 
the criterion for delivery of social 
reinforcement was raised to 2 minutes 
of attending behavior. Some increase 
in number of activity changes occurred
during Period 41, with a
subsequent leveling off. During this
part of Stage 4, the greatest number
of activity changes was 45, occurring
in Period 45; the fewest number, 10,
occurred in Period 47. The average
number of activity changes during
Stage 4-B was 32 per period, with
an average duration of 1 minute [illegible]
seconds per activity. Teacher
reinforcement averaged 33% of each
period.

The overall average for Stage 4
(A and B combined) was 27 changes
per session, with an average duration
of 1 minute 51 seconds per activity.

Social behavior. The quality of 
social behavior was defined and meas- 
ured as high, low, and isolate. Al- 
though not under experimental ma- 
nipulation, it merits remark for its 
constancy throughout the experimen- 
tal procedures. The averages per 
session were as follows: 


             High   Low   Isolate
Stage 1      45%    87%   16%
Stage 2      50%    38%   12%
Stage 3      48%    40%   18%
Stage 4-A    49%    39%   13%
Stage 4-B    41%    44%   15%


These figures were well within the 
range of the preschool’s normative 
social behavior. 


Discussion 


The data presented in Figure 1 give 
strong support to the hypothesis that 
attending behavior is “teachable” in 
the sense that it can be shaped and 
maintained by teachers. Moreover, 
adult social reinforcement again ap- 
pears to be a powerful instrument for 
this purpose. When adult social rein- 
forcement was given in a systematic 
fashion, solely as an immediate
consequence of continuing attending
behavior, the number of activity
changes diminished to half the num- 
ber that occurred under the more 
usual, nonsystematic adult procedures 
of the base-line and reversal stages. 

The continuing fluctuation of the 
data which occurred during each of 
the experimental periods may merit 
comment. Behavior does, of course, 
vary somewhat from day to day. The 
factors responsible for this variabil- 
ity were not brought under experi- 
mental control. Many of them are 
inherent in the field setting of a
preschool, and could hardly be
controlled in that setting. It is apparent,
though, that systematic control of 
adult social reinforcement, which is 
readily achieved, is sufficient to over- 
ride these factors (Baer & Wolf, 
1966). 

Two of the high points in activity 
changes, Periods 14 and 45, suggest 
possible examples of such uncon- 
trolled factors. During Periods 14 
and 45 James’s mother was present 
for the entire morning. She interacted 
with him freely each time he con- 
tacted her and went with him fre- 
quently when he requested her to 
come and look at a particular object 
or play situation. In addition, she 
made frequent suggestions that he 
“settle down” and paint her a pic- 
ture, build with blocks, or “tend to 
his own business.” The mother ap- 
peared to have more reinforcing value 
than the teachers on these novel 
occasions, a fact not surprising in 
itself. The fact that the mother was 
often reinforcing behaviors incom- 
patible with the behavior that teach- 
ers were shaping strengthened the 
original hypothesis that the child’s 
short attention span was in fact a 
function of adult social reinforcement. 

No formal attempt was made to 
secure data on the quality of James’s 
attending behaviors. In the judg- 
ment of the teachers, however, the 
quality improved steadily. During 
Stage 4, James frequently spent 15 to 
20 minutes pursuing a single activity 
such as digging, woodworking, or 
block building. Within these activities 
he made frequent excursions to get 
additional materials relevant to his 
project, such as a wheelbarrow or a 
dirt sifter. By definition, such depar- 
tures were recorded as activity 
changes, even though he returned 
and continued with the same play. 
Such occasions, clearly delineated in
the data, teachers considered
evidence of improved quality of
attending, for the side trips were
relevant to a core activity, rather 
than a series of unrelated activity 
changes as were typical of Stages 1, 
2, and 3. The data thus are probably 
a conservative estimate of the degree 
of change produced in James's atten- 
tion span. 

The data on social behavior are of 
particular interest for they answer in
part the often-asked question
regarding peripheral effects on overall
behavior patterns when one aspect 
of behavior is under intensive treat- 
ment. As was indicated, there was no 
change in the quality of James's 
social interaction, already deemed 
satisfactory by teachers at the start 
of the study, though the number of 
separate contacts did decrease, as
was predicted. These data add to the
evidence that only the behavior
specifically being worked on increases
or decreases as a function of the
reinforcement contingencies.

Throughout the study, the child's 
mother was informed of procedures 
and progress in frequent parent con- 
ferences. However, no systematic at- 
tempts were made to program presen- 
tation of social reinforcement from 
the family. For one thing, the mother 
worked, and there were frequent
changes of babysitters. Nevertheless,
the mother reported that James had 
"settled down" considerably at home. 
She kept no data to substantiate 
these statements, but did relate 
several incidents which indicated
that there was some generalization 
from preschool to home. Both the 
mother and the teachers judged that 
James was eminently more ready for 
kindergarten at the end of the study 
than he had been prior to it. The 
importance of intensive attending
behavior to future learning is obvious.
The ease of socially altering attending
behavior in either direction, while
perhaps less obvious, is no less
important to an analysis of children's
intellectual, perceptual, and social
development.


REFERENCES 


Allen, K. E., Hart, B. M., Buell, J. S.,
Harris, F. R., & Wolf, M. M. Effects of
social reinforcement on isolate behavior
of a nursery school child. Child Development,
1964, 35, 511-518.

Allen, K. E., & Harris, F. R. Elimination
of a child's excessive scratching by training
the mother in reinforcement procedures.
Behaviour Research and Therapy,
1966, 4, 79-84.

Allen, K. E., Reynolds, N. J., Harris, F. R.,
& Baer, D. M. Elimination of disruptive
classroom behaviors of a pair of preschool
boys through systematic control of adult
social reinforcement. Unpublished manuscript,
University of Washington, 1966.

Baer, D. M., & Wolf, M. M. The reinforcement
contingency in preschool and remedial
education. Paper presented at the
meeting of the Carnegie Foundation Conference
on Preschool Education, Chicago,
January 1966.

Bijou, S. W., & Baer, D. M. Child development.
Vol. 2. New York: Appleton-Century-Crofts,
1965.

Brawley, E. R., Harris, F. R., Peterson,
R. F., Allen, K. E., & Fleming, R. E. Behavior
modification of an autistic child.
Unpublished manuscript, University of
Washington, 1966.

Harris, F. R., Wolf, M. M., & Baer, D. M.
Effects of adult social reinforcement on
child behavior. Young Children, 1964, 20,
8-17.

Harris, F. R., Johnston, M. K., Kelley, C. S.,
& Wolf, M. M. Effects of positive social
reinforcement on regressed crawling
in a preschool child. Journal of Educational
Psychology, 1964, 55, 35-41.

Hart, B. M., Allen, K. E., Buell, J. S.,
Harris, F. R., & Wolf, M. M. Effects of
social reinforcement on operant crying.
Journal of Experimental Child Psychology,
1964, 1, 145-153.

Hart, B. M., Reynolds, N. J., Brawley, E. R.,
Harris, F. R., & Baer, D. M. Effects
of contingent and non-contingent social
reinforcement of the isolate behavior of a
nursery school girl. Unpublished manuscript,
University of Washington, 1966.




CONTROL OF HYPERACTIVITY BY SOCIAL REINFORCEMENT 237


Johnston, M. K., Kelley, C. S., Harris, F. R.,
& Wolf, M. M. An application of reinforcement
principles to development of
motor skills of a young child. Child
Development, 1966, 37, 379-387.

Patterson, G. R., Jones, R., Whittier, J., &
Wright, M. A. A behavior modification
technique for the hyperactive child.
Behaviour Research and Therapy, 1965, 2,
217-226.

Wolf, M. M., Risley, T., & Mees, H. Application
of operant conditioning procedures
to the behavior problems of an
autistic child. Behaviour Research and
Therapy, 1964, 1, 305-312.

(Received August 19, 1966) 


Journal of Educational Psychology 
1967, Vol. 58, No. 4, 238-244


SCHOOL-RELATED ATTITUDES OF CULTURALLY
DISADVANTAGED ELEMENTARY SCHOOL CHILDREN¹

DANIEL C. NEALE AND JOHN M. PROSHEK

University of Minnesota


A version of the Semantic Differential was used to sample attitudes of
350 children in the 4th, 5th, and 6th grades of 2 elementary schools.
Compared to the other schools in the same city, School 1 was low on
socioeconomic indicators, School 2 was near the median. Children in
School 1 had significantly higher evaluative scores for “my school
books,” “having to keep quiet,” “following rules,” and “my school
building.” Children in School 2 were significantly more positive toward
“my teacher,” “father,” and “college student.” As grade in school
increased, evaluative scores became significantly less positive for a
variety of stimuli, including “my school books,” “my classroom,” “my
teacher,” and “me.” Attitudes toward several stimuli were similar for
boys and girls in School 1 but markedly different for boys and girls
in School 2.


A common objective of educational
programs for culturally disadvantaged
children is the improvement of
attitudes toward school and toward
learning. Such children presumably
have a constellation of attitudes that
severely handicaps their school
performance, for example, low evaluation
of self, low level of aspiration,
and negative feelings about school
and school work.

The evaluation of educational programs
for the disadvantaged is difficult
because such attitudinal objectives
are difficult to measure. An
added problem is the fact that
disadvantaged children, almost by definition,
have difficulties in responding to
existing instruments, which may involve
complicated instructions or require
good reading and writing skills.

The present study was undertaken
to explore the usefulness of the
Semantic Differential in measuring
school-related attitudes of culturally
disadvantaged elementary school children.
The Semantic Differential is extremely
flexible and, while it is a paper-and-pencil
instrument, its reading and writing
requirements are minimal. Recently, it has been
shown to yield stable factor scores
with children as low as Grade 2
(DiVesta, 1966; DiVesta & Dick, 1966).

¹ Research supported in part by grants to
the University of Minnesota, Center for
Research in Human Learning, from the
National Science Foundation (GS541), the
National Institute of Child Health and Human
Development (I-POI-HD-01136-01-HDP),
and the Graduate School of the University
of Minnesota.

The study was also designed to 
test some prevailing assumptions 
about attitudes held by culturally 
disadvantaged children. In particu- 
lar, it was hypothesized that children 
from a school in a neighborhood low 
on socioeconomic indicators would
make less favorable evaluations of a
variety of school-related concepts 
than would children from a school in 
a neighborhood higher on the same 
indicators. 


METHOD


Two elementary schools in Minneapolis,
Minnesota, were chosen as sites for the
study. One school was chosen from the local
Youth Development Project’s "target area";
the other school was selected from a list of
"comparison schools," studied in the Project
as a contrast to the target area schools.

In the target area about one-third of the
residential buildings were rated as dilapidated
or deteriorated. Although it contained
less than 8% of the city's population, the
area had one-third of the families on relief.
One-fourth of the families had an annual
income under $3,000. The area has also
been characterized by high rates of crime 
and delinquency, high levels of family mo- 
bility, and low rates of owner occupancy. 
Studies conducted by Project staff members 
have documented the contrast between tar- 
get area and comparison school areas on 
these and other factors (Faunce, Bevis, & 
Murton, 1965). 
In Table 1 are selected socioeconomic
indicators for the two schools in the study.
Using a composite index which takes into
account rental rates, percentage of owner
occupancy, school attendance rates, and
pupil turnover rates, the first school was at
the fifth percentile rank and the other
school at the fifty-sixth percentile rank
compared to other city schools. A similar
contrast between the schools was present in
terms of median school years completed,
median family income, percentage of nonwhite
students, and a measure of police contacts
with youth.

The target area school community proved 
to be surprisingly high in average income 
and school years completed. Since the 1960 
census, urban renewal projects have forced 
lower-income families into the district, and 
therefore the 1960 figures are probably an 
overestimate. However, one should be cau- 
tious about generalizing the results of this 
study to certain areas of Chicago, New York, 
or other cities, where the deprivation may be 
more severe. 

The comparison school was selected to 
represent the average city school. The socio- 
economic indicators as reported in Table 1 
are consistent with that objective. 

A total of 350 children in the fourth, fifth, 
and sixth grades at the two schools re- 
sponded on 16 adjective scales to each of 
15 stimulus phrases. Following Osgood’s 
procedures (Osgood, Suci, & Tannenbaum, 
1957), the responses were subjected to a 
principal-component factor analysis with 
varimax rotation to identify clusters of ad- 
jective scales that comprised reasonable di- 
mensions of connotative meaning for the 
subjects in the study. 

TABLE 1 

SELECTED SOCIOECONOMIC INDICATORS FOR 
TWO ELEMENTARY SCHOOLS 

                                     Low-SES      Middle-SES 
Indicator                            school       school        City 
Composite index (rent, owner 
  occupancy, attendance, and 
  turnover)a                         [illegible]  [illegible] 
Median school years completed 
  (1960)b                            9.5          1.5           11.7 
Median family income                 $5,063       $6,077        $6,401 
Percentage of nonwhite students      27.5         [illegible]   [illegible] 
Percentage of 10-17 year olds 
  contacted by police (1964)c        11.7         3.7           5.7 

a Report of Minneapolis Public Schools, Department 
of Administrative Research, April [illegible], 1962. 
b 1960 Census. 
c Report of Minneapolis Youth Development Proj- 
ect, August, 1965. 


In Tables 2 and 3 rotated factor matrices 
are given for the children in the two schools 
in terms of the three most prominent factors. 
The evaluative dimension (Factor I) was 
by far the most prominent, accounting for 
over one-third of the total variance. Eight 
adjective scales (noted in Tables 2 and 3) 
had substantial loadings on Factor I and 
were selected to constitute the evaluative 
dimension. Responses of each child on these 
eight scales were summed for each stimulus 
phrase to represent the child's attitude to- 
ward the stimulus. 
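
The scoring rule described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' procedure: it assumes each of the eight evaluative scales has already been keyed so that 7 is the favorable pole, and the names `evaluative_score`, `N_SCALES`, `LOWEST`, `HIGHEST`, and `NEUTRAL` are hypothetical.

```python
from typing import Sequence

N_SCALES = 8                  # eight adjective scales selected from Factor I
SCALE_MIN, SCALE_MAX = 1, 7   # seven-step semantic differential scale


def evaluative_score(responses: Sequence[int]) -> int:
    """Sum one child's responses on the eight evaluative scales for one
    stimulus phrase. Assumes every response is keyed so 7 = favorable."""
    if len(responses) != N_SCALES:
        raise ValueError("expected one response per evaluative scale")
    if any(not SCALE_MIN <= r <= SCALE_MAX for r in responses):
        raise ValueError("responses must lie between 1 and 7")
    return sum(responses)


# Bounds implied by summing eight seven-step scales.
LOWEST = N_SCALES * SCALE_MIN    # all scales at the unfavorable pole
HIGHEST = N_SCALES * SCALE_MAX   # all scales at the favorable pole
NEUTRAL = N_SCALES * 4           # all scales at the midpoint
```

Under these assumptions a child who marks the midpoint on every scale receives the neutral score, and the attainable range follows directly from the number of scales.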

A comparison of the factor matrices for 
the two schools indicates similar loadings of 
adjective scales on the evaluative dimen- 
sion. The only exception is the “excited- 
calm” scale, which loaded more heavily on 
Factor I in the low-socioeconomic status 
(SES) school than in the middle-SES school. 

The secondary factors (II and III) were 
not nearly as prominent as the evaluative 
dimension. Each accounted for less than 
10% of the total variance. The activity di- 
mension, frequently found by Osgood, was 
relatively clear as Factor II, and the com- 
monly found dimension of potency was a 
clear Factor III. 

Because of the weakness of Factors II 
and III, scores on these dimensions were 
not subjected to further analysis. The re- 
mainder of the study was concerned only 
with an analysis of scores on the evaluative, 
or good-bad, dimension. 

A three-way analysis of variance was 
conducted on evaluative scores for each of 
the 15 stimulus phrases to test the effects of 
school, sex, and grade. 
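
As a simplified illustration of the F ratio on which such an analysis rests, the sketch below computes a one-way F for a single factor (for example, school). It is not the three-way analysis used in the study: interactions and the other two factors are ignored, and the function name `one_way_f` is hypothetical.

```python
from typing import Sequence


def one_way_f(groups: Sequence[Sequence[float]]) -> float:
    """Return the one-way analysis-of-variance F ratio:
    mean square between groups divided by mean square within groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

With two groups of evaluative scores, one per school, the returned value corresponds to the kind of between-schools F reported for each stimulus phrase, although the published values come from the full three-way design.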


TABLE 2 
ROTATED FACTOR MATRIX FOR 142 FOURTH, 
FIFTH, AND SIXTH GRADE STUDENTS IN 
A SCHOOL LOW ON SOCIOECONOMIC 
INDICATORS 


                 Factor 
Adjectives 
               I    II    III 


a Scored as a part of an evaluative dimension. 


TABLE 3 
ROTATED FACTOR MATRIX FOR 208 FOURTH, 
FIFTH, AND SIXTH GRADE STUDENTS IN 
A SCHOOL NEAR THE MEDIAN ON 
SOCIOECONOMIC INDICATORS 


                 Factor 
Adjectives 
               I    II    III 
slow-fast 
new-old kd 
wise-foolish .564 
hare d .513 
interesting-boring .625 
sad-happy | .642 
id pue 417 
651 
unusual 
male-female d 
colorless-colorful EA 
fair-unfair E 
soft-hard 1851 
worried-relaxed 1437 
pleasurable-painful 1507 
Eigenvalues 5.062 | 1.457 | 1.229 | 7.747 


a Scored as part of an evaluative dimension. 


RESULTS AND DISCUSSION 


Results of the analysis are sum- 
marized in Tables 4-7. 

In Table 4 the 15 stimulus phrases 
are ranked from most to least favor- 
able on the basis of the mean scores 
of all 350 children in the study. Since 
responses on each adjective scale 
could range from one to seven, and 
the evaluative dimension consisted of 
the sum of responses to eight of the 
adjective scales, the scores could 
vary from 8 (least favorable) to 56 
(most favorable). Most were above 
the neutral score of 32. The overall 
ranking of the stimulus phrases was 
quite conventional. Mother and fa- 
ther were evaluated most favorably; 
fighting and stealing were evaluated 
least favorably. 

Table 4 also gives the mean scores 
for the children in the two schools 
and the F-ratio associated with the 
comparison of each pair of means. 
Children in the middle-SES school, 
compared to those in the lower-SES 
school, made significantly more pos- 
itive evaluations of “father,” “col- 
lege student,” and “my teacher.” 

Differential evaluation of “father” 


TABLE 4 
MEAN SCORES ON EVALUATIVE DIMENSION 
OF SEMANTIC DIFFERENTIAL FOR CHIL- 
DREN IN TWO SCHOOLS DIFFERING 
ON SELECTED SOCIOECONOMIC 
INDICATORS 


Stimulus | Total | Low-SES school | Middle-SES school | Analysis of variance between schools, F 
Mother 48.0 
Father 47.7 
College student 45.5 
My teacher 45.5 
Reading a book 45.1 
My school building 43.5 
My classroom 42.5 
Me 41.9 
My school books 41.6 
Following rules 40.5 
Working arithmetic 
problems 39.7 
Talking in front of 
Having tok ict | 38:3 
AV) eep quiet a 
Fighting with other 
children 25.7 
Stealing things 18.0 


Note.—For the low-SES school, N = 142; for the 
middle-SES school, N = 208; total, N = 350. Scores may 
range from 8 (least favorable) to 56 (most favorable). 
A score of 32 is neither favorable nor unfavorable. 

* p < .05. 
may reflect the absence of strong fa- 
thers in many low-SES homes. In 
these data the difference is more pro- 
nounced for girls than for boys (see 
Table 7). Differential evaluations of 
"college student" and “my teacher" 
are consistent with the prevailing as- 
sumptions that middle-SES children 
regard college and teachers more 
favorably than do lower-SES children. 

On the other hand, children in the 
lower-SES school, compared to the 
middle-SES school, made signifi- 
cantly more positive evaluations of 
"my school building,” “my school 
books,” “talking in front of class,” 
and “having to keep quiet.” These 
findings do not fit prevailing assump- 
tions well. 

The relatively more favorable eval- 
uation of “my school building" 
may reflect the contrast between the 
school and other buildings in the low- 
SES neighborhood. Whatever the rea- 
son, the marked contrast between 
schools occurs in the responses of 
boys rather than girls (see Table 7). 


TABLE 5 
MEAN SCORES ON EVALUATIVE DIMENSION 
OF SEMANTIC DIFFERENTIAL FOR 
CHILDREN IN THREE GRADES 


                                       Grade           Analysis of variance 
Stimulus                        Fourth  Fifth  Sixth   between grades, F 
Mother                           49.0   47.8   47.3     2.05 
Father                           48.3   47.2   47.0      .26 
College student                  46.3   45.3   44.9      .86 
My teacher                       47.8   45.4   43.2     6.98** 
Reading a book                   45.6   45.2   44.6     1.15 
My school building               44.7   43.6   42.2     2.00 
My classroom                     45.1   43.2   39.4 
Me                               44.0   41.8   40.0 
My school books                  43.4   42.8   39.9 
Following rules                  42.7   40.9   38.1 
Working arithmetic problems      42.1   40.0   37.1 
Talking in front of class        41.5   36.6   35.7 
Having to keep quiet             34.9   32.2   29.8 
Fighting with other children     27.3   26.5   23.5 
Stealing things                  19.6   18.3   16.2 


Note.—For the fourth grade, N = 117; for the fifth 
grade, N = 111; for the sixth grade, N = 122. Scores 
may range from 8 (least favorable) to 56 (most favorable). 
A score of 32 is neither favorable nor unfavorable. 

* p < .05. 
** p < .01. 
*** p < .001. 


TABLE 6 
MEAN SCORES ON EVALUATIVE DIMENSION 
OF SEMANTIC DIFFERENTIAL FOR 
BOYS AND GIRLS 


                                                Analysis of variance 
Stimulus                        Boys   Girls    between sexes, F 
Mother                          46.2   49.5     10.71*** 
Father                          46.8   48.5      2.25 
College student                 44.6   46.3      2.72 
My teacher                      44.2   46.5      5.63* 
Reading a book                  43.5   46.5      7.04** 
My school building              41.5   45.2     11.85** 
My classroom                    41.8   43.0      4.17* 
Me                              41.6   42.2       .66 
My school books                 40.6   42.5      4.50* 
Following rules                 39.2   41.0      6.28* 
Working arithmetic problems     39.6   39.8       .01 
Talking in front of class       37.1   38.6      1.69 
Having to keep quiet            31.7   32.7       .98 
Fighting with other children    27.4   24.9      4.12* 
Stealing things                 19.3   16.9      2.78 


Note.—Scores may range from 8 (least favorable) to 56 (most favorable). 
A score of 32 is neither favorable nor unfavorable. 

*** p < .001. 

The relatively more favorable eval- 
uation by lower-SES children of 
school books is even more surprising. 
The fact that “reading a book” is not 
given a higher evaluation by low- 
SES children compared to middle- 
SES children may mean that low-SES 
children do value school books al- 
though they do not like to read them. 

The fact that “talking in front of 
class” and “having to keep quiet” are 
given relatively higher evaluations 
is likewise puzzling. Perhaps in the 
lower-SES schools more emphasis 
must be placed upon the importance of 
these school virtues, which are per- 
haps more taken for granted in mid- 
dle-SES schools. 

These findings seem to parallel 
those of Greenberg, Gerver, Chall, 
and Davidson (1965), who found a 
tendency for low-achieving Negroes 
in a deprived school to be more favor- 
able than high-achieving Negroes to 
some socially approved concepts. 


TABLE 7 
MEAN SCORES ON EVALUATIVE DIMENSION OF SEMANTIC DIFFERENTIAL BY 
SCHOOL AND SEX 

                                         School 
                                 Low SES        Middle SES     Analysis of variance 
Stimulus                        Boys   Girls   Boys   Girls    of School X Sex, F 

Mother                          46.7   49.1    45.9   49.8      1.12 
Father                          45.9   45.8    47.4   50.4      4.97* 
College student                 43.5   44.7    45.2   47.4       .44 
My teacher                      42.7   43.9    45.0   48.4       .85 
Reading a book                  44.0   44.0    42.9   48.3      7.16** 
My school building              45.5   45.8    38.9   44.7     11.90*** 
My classroom                    43.1   42.3    40.0   44.5      6.48* 
Me                              42.4   41.8    41.5   42.5      1.74 
My school books                 42.5   43.4    39.6   41.8       .65 
Following rules                 41.1   41.5    38.3   41.8      2.77 
Working arithmetic problems     41.9   39.0    38.6   40.4      5.07* 
Talking in front of class       39.7   38.5    35.3   38.7      3.64 
Having to keep quiet            34.3   34.8    30.0   31.1       .02 
Fighting with other children    26.4   25.9    28.7   23.1      3.93* 
Stealing things                 19.4   18.5    19.7   15.7      2.05 


Note.—For low-SES school, N = 62 boys, N = 80 girls; for middle-SES school, 
N = 97 boys, N = 111 girls. Scores may range from 8 (least favorable) to 56 (most favor- 
able). A score of 32 is neither favorable nor unfavorable. 


Contrary to expectations, children 
in the two schools did not differ sig- 
nificantly in their evaluations of 
themselves, fighting, or stealing. 

In Table 5 mean evaluative scores 
are reported for children grouped by 
grade. A striking pattern is evident 
in these data, namely, that for every 
stimulus phrase except one, students 
in Grade 6 were lower than students 
in Grade 5 who were lower than stu- 
dents in Grade 4. Only “father” 
seemed to stem the tide of negativism. 
Many of the observed trends are 
highly significant. 

The interpretation of this general 
tendency is difficult. The fact that 
many of the pronounced differences 
are related to school, self, and duty 
may imply a growing dissatisfaction 
with school, a kind of progressive 
alienation. On the other hand, in- 
creasingly negative responses as grade 
in school increases may be a function 
of shifts in the way children respond 
to instruments of this type. Further 
study of this matter is clearly indi- 
cated. 

In Table 6 mean evaluative scores 
are given for boys and girls. In every 
case girls were more favorable than 
boys to “good” things (mean higher 
than 32) and less favorable than 
boys to “bad” things (mean lower 
than 32). The differences were statis- 
tically significant for a number of the 
school-related stimuli, including “my 
teacher,” “reading a book,” “my 
school building,” “my classroom,” 
and “my school books.” 

Perhaps more interesting than the 
main effect differences between boys 
and girls are the interaction effects 
between school and sex reported in 
Table 7. An interesting pattern is evi- 
dent on a number of the stimuli. 
Whereas differences between boys and 
girls are quite marked in the middle- 
SES school, no such differences exist 
at the lower-SES school. For exam- 
ple, evaluations of "father" were al- 
most identical for boys and girls in 
the low-SES school, whereas girls 
were more favorable to "father" in 
the middle-SES school. A similar pat- 
tern exists for “reading a book,” 
“my school building,” and “my class- 
room.” On this instrument, at least, 
sex differences in attitudes are not as 
marked for low-SES children as for 
middle-SES children. 


CONCLUSIONS 


The interpretation of the data pre- 
sented in this study is extremely diffi- 
cult. With such an instrument the 
subject’s responses are to some de- 
gree dependent upon his interpreta- 
tion of what responses are valued by 
the experimenter. Before much con- 
fidence can be placed on interpreta- 
tions of these data, the instrument 
should be tested in a variety of con- 
texts to estimate the extent of such 
bias. However, in spite of such res- 
ervations, the results of the present 
study do provide a number of chal- 
lenging hypotheses which might well 
be pursued further. 

In the first place comparisons be- 
tween the two schools indicate that 
culturally deprived children, as de- 
fined in this study, are not nega- 
tive about school, at least in the sense 
of devaluing school and school-re- 
lated activities. To the contrary, it 
appears that school is valued highly, 
perhaps as something difficult to at- 
tain and perhaps as a place where 
unpleasant things occur, but never- 
theless valued. It is extremely im- 


portant to keep in mind that the 
evaluative dimension on the Semantic 
Differential is subject to ambiguities 
on this point. A positive evaluation 
may reflect high valuation or a strong 
liking, and the two things do not al- 
ways go together. A person may easily 
place a high value on scholarly en- 
deavors but not like to engage in 
them. 

Second, the data in this study sug- 
gest a systematic change in atti- 
tudes toward school as a function of 
grade level. Evaluations of a variety 
of school-related phrases were in- 
creasingly negative as grade in school 
increased. The same was true for an 
evaluation of self. The trends were 
not a function of school or sex. One 
hypothesis that must be considered 
is that children learn to like school 
less as they progress through the 
grades. If so, it suggests the schools 
are falling down on one of the most 
important proximate objectives of 
education, the cultivation of a posi- 
tive attitude toward schooling. Such 
an attitude has long been held to be a 
part of a sound motivation for school 
learning. In addition, the fact that 
evaluations of “me” were increas- 
ingly negative as a function of grade 
level implies that children like them- 
selves less as they grow older. A 
positive self-concept is also held by 
educators as a desirable objective of 
schooling. 

Of course, these data are far from 
conclusive. For example, the fact that 
evaluations of fighting and stealing 
also became increasingly negative as 
a function of grade level is not con- 
sistent with the view that children 
are becoming alienated from school 
and society. Instead it suggests a 
change in response set to the instru- 
ment, as if with increasing age one 
reports all evaluations somewhat 
more negatively. In that case the 
data do not suggest a failure of 
schools but rather an important fact 
to be taken into account in interpret- 
ing attitude questionnaire data from 
different age groups. 

Finally, the data in the study con- 
tradict the expectation that sex dif- 
ferences in evaluations of school that 
are common in middle-class children 
are necessarily found in culturally 
disadvantaged children. The hypothe- 
sis is rather suggested that sex-role 
differences with respect to education 
may not be as important in culturally 
disadvantaged as in middle-class 
schools. 
ACQUISITION OF PROBLEM-SOLVING STRATEGIES 
IN YOUNG CHILDREN AND ITS RELATION TO 
VERBALIZATION¹ 
CAROLYN STERN 
University of California, Los Angeles 


That young children could learn problem-solving strategies was dem- 
onstrated (p < .01) with 107 3rd-grade children. A 2 X 2 X 2 factorial 
design with 2 levels of MA was used. The treatment conditions in- 
cluded multiple-hypotheses testing with and without speaking, and 
single-hypothesis testing with and without speaking. The problems re- 
quired the selection of 1 of 4 concepts as the basis for matching an 
exemplar to a model in each 6-slide problem. Children taught to use 
knowledge of results to test 1 concept at a time (SH) scored signifi- 
cantly higher on posttests than did those taught to test several hy- 
potheses at once (MH). Verbalization demonstrated no effect with 
either treatment group; mental age was an important variable only in 
the acquisition of the MH strategy. 


In a series of concept-identification 
studies reported by Wittrock (1964) 
and Stern (1965), the authors repeat- 
edly observed that young children 
tended to perform at purely chance 
levels when the basis for solution was 
not provided by the experimenter. 
Under such “discovery” treatments, 
children frequently ceased attending 
to the specific stimuli of the problem 
situation and either curtailed their 
efforts or engaged in various types 
of random activity. 

These children did not generate 
techniques for problem-solving even 
when the number of trials to cri- 
terion was unlimited. On the con- 
trary, since the correct answer was 
unrelated to a discoverable sequence, 
for example, some type of alterna- 
tion pattern, and since the propor- 
tion of successes to failures was 
purely a function of chance, many 
children gave up in frustration be- 
fore arriving at a solution. Obviously, 
the invigorating and intellectually 


¹ This research was supported by the 
Cooperative Research Program, United 
States Office of Education (Project No. 
stimulating effects of discovery, de- 
fined as an instructional procedure, 
were not experienced by those whose 
best efforts did not result in discov- 
ery, defined in terms of performance. 
Even those children who were able 
to derive their own problem solutions 
during training did not subsequently 
demonstrate significantly improved 
ability to cope with a related but 
slightly different type of problem. 
There are several ways in which 
these findings can be interpreted. 
First, they may be viewed as sup- 
porting Piaget’s theoretical postulates 
of sequential cognitive development. 
From this position it might be argued 
that young children have not yet 
reached the stage where they are 
able to perform the logical opera- 
tions essential to inferential reason- 
ing. On the other hand, accepting 
Jerome Bruner's dictum that “any 
subject can be taught to any child in 
some honest form” together with the 
programmer’s hypothesis that “there 
are no poor learners, only poor pro- 
grams,” it might be inferred that a 
simpler program, with smaller steps 
and more repetition, would be able 
to lead children of this age to dis- 
cover how to solve concept-identifica- 
tion problems without direct assist- 
ance. 

Still another way of looking at 
these findings is that the child's abil- 
ity to cope with and profit from dis- 
covery, as a heuristic, is a function 
of his previous acquisition of skills 
of "learning-to-learn." That is, the 
ability to solve problems can be 
viewed as an instructional objective, 
analogous to motor skills or subject 
matter content. As Gagné (1966) 
points out, “The strategies of dis- 
covery ...do not have to be learned 
by discovery [p. 148]." 

A series of experiments was de- 
signed to test the hypothesis that 
children can be taught strategies 
which will subsequently improve their 
ability to “discover.” The first ex- 
periment (Stern & Keislar, 1966) 
used four groups of third-grade chil- 
dren. Two of these groups were given 
instruction, each in a different prob- 
lem-solving strategy. The third group 
was given practice with the materials 
and problems; presumably members 
of this group would have the op- 
portunity to derive their own pro- 
cedures for solving these problems. A 
fourth group was given preliminary 
familiarization with the experimental 
situation but no opportunity to prac- 
tice with the problems. 

The results demonstrated that chil- 
dren taught strategies for solving 
problems performed significantly bet- 
ter (p < .001) with new but similar 
problems than children who had not 
been given instruction. This superi- 
ority was found only with the simpler 
of the two instructed strategies. The 
more sophisticated strategy, which 
should have produced superior prob- 
lem-solving skill, was in actuality 
superior neither to the practice-with- 
out-instruction group nor to the no- 
practice, no-instruction group. 




Closer inspection of the data re- 
vealed that the performance of these 
groups was approximately chance; 
that is, the children had not learned 
to use the strategy and hence were no 
different from children who had not 
been exposed to this training. One 
source of difficulty might be that the 
children were unable to cope with the 
problems of memory storage and in- 
formation retrieval. Perhaps the chil- 
dren were unable to remember which 
concepts had been tested and affirmed 
and which ones rejected. Thus they 
might be wastefully retesting already 
disproved hypotheses. A new experi- 
ment was therefore designed to test 
whether young children could be 
taught to execute this superior strat- 
egy if a more effective instructional 
program, one which also provided 
some memory storage facilitation, 
was devised. 

Before beginning this experiment, 
several weeks were devoted to pro- 
gram development. Each phase of the 
instruction was tried out and tested 
with about 40 children selected at 
random from the school population 
to be used in the experiment. When 
the final program was set, several new 
features had been added. The most 
important of these was a booklet 
system by which the factor of mem- 
ory load could be minimized as an 
effective variable. In addition, the 
question of the effect of verbalization 
was raised. In the earlier experiment, 
all the children had been required to 
verbalize overtly the correct con- 
cept before making a selection re- 
sponse based on that concept. The 
implied assumption, that verbaliza- 
tion would facilitate learning, was 
not tested in that design. It was quite 
possible that verbalization, contrary 
to expectation, was producing in- 
terference, rather than facilitation. 

The present experiment was de- 
signed to test the following hypothe- 
ses: 

1. Children taught to follow in- 
structions with a theoretically su- 
perior type of problem-solving strat- 
egy will be better problem-solvers 
when these instructions are used as 
prompts, compared with children 
prompted with a set of simpler strat- 
egy instructions; 

2. When strategy prompts are re- 
moved, the children taught the more 
sophisticated strategy will continue 
to solve the same type of problem 
more successfully than children taught 
the simpler strategy; 

3. Children who are required to 
say the correct concept aloud, under 
either strategy treatment, will be 
more competent problem-solvers than 
children who are not taught to supply 
any overt verbal cues. 


METHOD 


Apparatus 


The UCLA Group Teaching Equipment, 
described in detail in Stern (1965) with 
the voice-relay modification reported in 
Keislar, Stern, and Mace (1966), was used 
for this study. The basic components in- 
cluded 10 three-sided booths, each equipped 
with earphones, a microphone, and a mul- 
tiple-choice individual response panel, all 
connected to a master control set. A Kodak 
Carousel slide projector and a Wollensak 
stereo tape recorder insured the controlled 
presentation of the audio-visual program; 
a Clary data recorder, with Flexowriter 
tape-to-card converter, automatically re- 
corded and interpreted the individual re- 
sponses. 


Subjects 


All the third-grade children in a West 
Los Angeles elementary school (with the 
exception of approximately 40 children who 
had participated in the program-develop- 
ment phase) were included in the study. 
There were 107 children (56 boys and 51 
girls) assigned to four experimental treat- 
ments, using a stratified-random design 
based on mental age. The mean chrono- 
logical age of the experimental subjects was 
8 years 5 months, the standard deviation 
4 months; the mean mental age was 9 years 
4 months, standard deviation 1 year; the 
mean IQ was 110.6, SD 13.0. 
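
A stratified-random assignment of this kind might be sketched as follows. The details are assumptions made for illustration only: the two mental-age strata are formed here by a median split, treatments are dealt out in rotation within each shuffled stratum, and the names `TREATMENTS` and `stratified_assign` are hypothetical.

```python
import random
from typing import Dict, Tuple

# Four experimental treatments: two strategies by two verbalization conditions.
TREATMENTS = [
    "MH speaking", "MH not speaking",
    "SH speaking", "SH not speaking",
]


def stratified_assign(
    mental_ages: Dict[str, int], seed: int = 0
) -> Dict[str, Tuple[str, str]]:
    """Assign each child to (stratum, treatment).

    mental_ages maps a child identifier to mental age in months.
    Assumption: strata are the lower and upper halves of the ranking."""
    rng = random.Random(seed)
    ranked = sorted(mental_ages, key=lambda child: mental_ages[child])
    half = len(ranked) // 2
    strata = {"low MA": ranked[:half], "high MA": ranked[half:]}

    assignment: Dict[str, Tuple[str, str]] = {}
    for label, children in strata.items():
        shuffled = children[:]
        rng.shuffle(shuffled)  # random order within the stratum
        for i, child in enumerate(shuffled):
            # Rotating through treatments keeps cell sizes nearly equal.
            assignment[child] = (label, TREATMENTS[i % len(TREATMENTS)])
    return assignment
```

Shuffling within each stratum before dealing out treatments keeps the mental-age levels balanced across the four groups while leaving the assignment of any individual child to chance.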


Procedure 


All instruction and testing were carried 
out entirely by means of autoinstructional 
programs, using the equipment and mate- 
rials described. Before the instruction began, 
all the children were given a pretest to 
see how well they could perform the prob- 
lem-solving task. Then followed 6 days of 
training, each session lasting approximately 
15 minutes. Immediately after the training, 
a posttest was given; 7 weeks later, a re- 
tention test was administered. 

Experimental task and materials. The 
basic task for this study was identical to that 
described in Stern and Keislar (1967). A 
35-mm. color slide presented a model pic- 
ture in the top center and two exemplars 
below. A problem consisted of a set of 
slides, for all of which the basis for match- 
ing the exemplar with a model was one of 
four concepts: number, color, size, or shape. 
The object was for the child to use knowl- 
edge of results to identify which of these 
four concepts was the basis for matching 
for each problem. 
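
The structure of a single slide can be made concrete with a small sketch. The attribute values and the names `CONCEPTS`, `matching_concepts`, and `correct_exemplar` are hypothetical; only the four concepts and the model-plus-two-exemplars layout come from the description above.

```python
from typing import Dict, List, Set

CONCEPTS = ("number", "color", "size", "shape")
Picture = Dict[str, object]  # one value per concept


def matching_concepts(model: Picture, exemplar: Picture) -> Set[str]:
    """Concepts on which an exemplar is the same as the model."""
    return {c for c in CONCEPTS if model[c] == exemplar[c]}


def correct_exemplar(model: Picture, exemplars: List[Picture], rule: str) -> int:
    """Index of the exemplar that matches the model on the problem's rule."""
    matches = [i for i, ex in enumerate(exemplars) if ex[rule] == model[rule]]
    if len(matches) != 1:
        raise ValueError("the rule must single out exactly one exemplar")
    return matches[0]


# A hypothetical slide: model on top, two exemplars below.
model = {"number": 2, "color": "red", "size": "large", "shape": "circle"}
left = {"number": 2, "color": "blue", "size": "small", "shape": "square"}
right = {"number": 3, "color": "red", "size": "small", "shape": "circle"}
```

On this hypothetical slide the right-hand exemplar is correct if the rule is color or shape, and the left-hand exemplar is correct if the rule is number, so knowledge of results on one slide narrows the possible rules without necessarily identifying a single one.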

Experimental treatments. There were 
two strategy programs, Multiple Hypothe- 
sis testing and Single Hypothesis testing, 
each presented under two conditions of 
verbalization, not-speaking and speaking, 
thus providing four experimental treat- 
ments. 

Within each of the strategy treatments, 
the only difference between the speaking 
and not-speaking condition was that the 
speaking groups were required to say the 
selected hypothesis aloud, in addition to 
writing the correct rule. For instance, where 
the not-speaking groups were told: “Choose 
a rule and write it in the proper box. Press 
the button that goes with this picture,” the 
speaking groups were told: “Choose a rule 
and write it in the proper box. Say the rule. 
Press the button that goes with this picture.” 
For the speaking condition, the voice-relay 
mechanism was adjusted so that if the 
child did not speak he could not receive 
a reinforcing light for his selection response. 

Memory storage. In the previous experi- 
ment, it had been observed that children had 
difficulty retaining negative feedback infor- 
mation so as to consistently reject incorrect 
solutions. To avoid this problem, children 
were supplied with booklets and pencils to 
use in aiding memory. 

In the Multiple Hypothesis treatment, 
the subjects were helped to focus on the two 
hypotheses associated with the correct ex- 
emplar on the first slide, and then to select 
one of these two on the basis of informa- 
tion provided with the second slide. Having 
thus attained the rule by the second slide, 
they should be able to select the correct 
exemplar for the remaining slides of the 
problem. For the Single Hypothesis group, 
the booklets helped remind each child which 
concepts were possible and which ones had 
been tried and rejected, so that a non- 
replacement procedure could be adopted. 
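
The logic of the two strategies can be sketched in simplified form. The sketch assumes that feedback on each slide cleanly confirms or rejects a tested rule, which ignores the possibility of a choice being correct by chance; the function names `single_hypothesis` and `multiple_hypothesis` are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def single_hypothesis(true_rule: str, candidates: Sequence[str]) -> List[str]:
    """Test one rule at a time without replacement.

    Returns the rules tried, in order, ending with the confirmed rule.
    The list of remaining rules plays the part of the booklet record."""
    remaining = list(candidates)
    tried: List[str] = []
    while remaining:
        guess = remaining.pop(0)   # a rejected rule is never retried
        tried.append(guess)
        if guess == true_rule:
            return tried
    raise ValueError("true rule was not among the candidates")


def multiple_hypothesis(matches_per_slide: Sequence[Set[str]]) -> Tuple[str, int]:
    """Hold every rule consistent with the first slide, then narrow.

    matches_per_slide[i] is the set of concepts on which the correct
    exemplar of slide i matched the model. Returns (rule, slides used)."""
    candidates = set(matches_per_slide[0])
    slides_used = 1
    for matches in matches_per_slide[1:]:
        if len(candidates) <= 1:
            break
        candidates &= set(matches)  # keep rules still consistent
        slides_used += 1
    if len(candidates) != 1:
        raise ValueError("feedback did not isolate a single rule")
    return candidates.pop(), slides_used
```

Under these assumptions the Multiple Hypothesis procedure attains the rule by the second slide whenever the first slide leaves two hypotheses and the second distinguishes them, whereas the Single Hypothesis procedure may need as many tests as there are concepts.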

The booklets consisted of four different 
types of pages appropriate for the needs of 
each strategy group and each lesson. Figure 
1 presents samples of the four page types, 
reduced in size. The actual booklets were 
constructed by cutting regular manuscript 
paper into thirds, width-wise, so that each 
sheet was approximately 3% X 8% inches. 
The number and types of pages varied for 
each day's lesson, for each of the basic 
strategy treatments. Table 1 shows the 
composition of the booklets for both pro- 
grams. 

Strategy instruction. On the first day of 
training, all treatments received the same 
instructions and slides for the initial 25 
items of the lesson. The first six slides taught 
the children how to operate the response 
panels; the next 10 slides presented the 
concept-identification task in terms of two 
concepts, color and shape. The booklets 
were then distributed and the children were 
taught to write the first letter of the correct 
concept, as illustrated in nine slides, in the 
appropriate box on the Type A booklet 
page (see Figure 1). 

The Multiple Hypothesis groups were 
then shown six slides in which one of the 
exemplars was the same as the model in 
both color and shape. As the first basic step 
in the Multiple Hypothesis testing proce- 
dure, the child had to write both of these 


FIG. 1. Sample sheets from booklets for 
training program (page Types A, B, C, and D). 
TABLE 1 


COMPOSITION OF BOOKLETS OVER 
SIX LESSONS 


Multiple hypothesis Single hypothesis 
Lesson program program 
1 5 Pages of Type A | 10 Pages of Type A 
2 Pages of Type B 
2 4 Pages of Type A 6 Pages of Type C 
2 Pages of Type B 4 Pages of Type A 
3 Same as Lesson 2 4 Pages of Type A 
8 Pages of Type D 
4 2 Pages of Type A 5 Pages of Type 
4 Pages of Type D 5 Pages of Type C 
5 3 Pages of Type B Howe 
'ages of Type 
6 2 Pages of Type B | 5 Pages of Type D 


rules in the same box, corresponding to the 
position of the exemplar on the screen. The 
children were then taught the second basic 
step in this strategy: to select the one 
correct concept from the two identified as 
associated with the correct exemplar. They 
were given six four-slide problems in which 
they used the Type B pages. 

The Single Hypothesis groups, on the 
other hand, completed the first lesson with 
practice in selecting a concept, associating 
it with a specific exemplar, and testing to 
see if it was correct, using only Type A 
pages. 

In the first lesson, only two concepts 
were used, color and shape. The third con- 
cept, size, was introduced on the second day 
and the fourth concept, number, on the third 
day. After the fourth concept had been 
identified, a card (Figure 2) was hung on 
the front panel of each booth, listing the 
repertoire of rules from which problem 
solutions could be drawn. This was another 
way in which the memory load of the ex- 
perimental task was reduced for all groups. 

While the visual stimuli were not identi- 
cal for the two programs, as they were in 
the first experiment, the same slides and 
similar tasks were provided wherever possi- 
ble. On the fifth and sixth days of training, 
the slides and problems were identical for 
all treatment groups; only the strategy 
instructions varied. 


Criterion Tests 


The posttest, given after the last day of 
training, consisted of nine six-slide prob- 
lems, similar to those used during training. 
Two months later, the retention test, using 
the same slides but in different problem 
sets, was administered. 
Number 
Color 
siZe 
Shape 


FIG. 2. Rule card. 


RESULTS 


The means and standard devia- 
tions on all measures for various 
treatment subgroups are presented in 
Table 2. 

Before any strategy training was 
instituted, all the children were given 
a problem-solving pretest to deter- 
mine how well they could perform 
the experimental task before training.
The scores on this two-choice
task were at a chance level, since
the groups averaged approximately 10
points out of 20. Evidently, without
training of some kind, the problem-solving
task adopted for this project
was too difficult for these children.
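
The chance-level reading of the pretest can be checked with a short calculation. The sketch below assumes that each of the 20 two-choice items is an independent guess with probability .5; the function name is illustrative and does not come from the study.

```python
import math

def chance_score(n_items: int, p_correct: float) -> tuple[float, float]:
    """Mean and standard deviation of the total score under pure guessing (binomial model)."""
    mean = n_items * p_correct
    sd = math.sqrt(n_items * p_correct * (1 - p_correct))
    return mean, sd

# 20 items, two choices per item (assumed independent guesses)
mean, sd = chance_score(20, 0.5)
print(f"expected score under guessing: {mean:.1f}, SD: {sd:.2f}")
```

Under these assumptions the expected score is 10 of 20, which is the value the group averages are compared against.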

For each of the experimental 
groups, the importance of the gain 
made during training can be assessed 
in terms of the difference between the 
scores on the problem-solving pre- 
test and the training test. Here both 
of the within-treatment differences 
were significant (t = 8.4 and 9.3, p 
< .001). These gains remained at a
reliable level (t = 4.5, p < .01
and t = 9.4, p < .001) for the
Multiple and Single Hypothesis 
treatments, respectively, on the post- 
test which did not provide the pro- 
cedural steps for each of the strat- 
egies. 

In order to compare the effective- 
ness of the new programs with those 
used in the previous study, the data 
of that study were reanalyzed and 
scores on training and posttest ob- 
tained using the four test items on 
each of the last five problems. Table 
3 includes these recomputed scores, 
together with comparable scores from 
the present study. These data show a 
significant difference (t = 3.3, p <
.01) in favor of the Multiple Hypothesis
treatment on the training test in


TABLE 2 
MEANS AND STANDARD DEVIATIONS ON DEPENDENT AND INDEPENDENT VARIABLES 
Pretest(a) | Training(b) | Posttest | Retention test
Treatment 43.2 
N sp | M | SD 
Multiple hypothesis 
p MEE Hun 
Not i r k 
metal ndm a 4.1 | 15.0 | 3.7 
Single hypothesis 
Speaking 26 3.1 | 15.6 | 2.2 
Not speaking 26 3.2 | 16.5 | 3.0 
Total 52 3.1 | 16.0 | 2.7 


Note. All tests consisted of 20 items.
(a) This test, administered before training, was another form of the posttest.
(b) This test was given on the last day of training before the strategy prompts were removed.


TABLE 3
COMPARISON OF THE EFFECTIVENESS OF STRATEGY PROGRAMS USED IN TWO STUDIES,
BASED ON TRAINING AND POSTTEST MEAN SCORES

Treatment | Study(a) | N
Multiple hypothesis | 1 | 26
Multiple hypothesis | 2 | 55
Single hypothesis | 1 | 29
Single hypothesis | 2 | 53

(a) Study 1 refers to the previously-cited experiment (Stern & Keislar, 1966). Study 2 refers to the present experiment.
* p < .01.
** p < .001.



the first study, which was not found
with the revised programs used in
the second study. Comparing the
training test and the posttest, however,
there was a significant difference
between these two tests with the
Multiple Hypothesis treatment in
both studies. The Single Hypothesis
groups showed no reliable decrements
when their prompts were removed.
Evidently the Single Hypothesis
groups internalized the instructions
they had been given and were able to
apply the procedure with or without
prompts.

Analyses of variance and covariance
were computed, using the pretest
score as the covariable in the
latter analyses. No differences in significance
levels were found, reflecting
the fact that the correlations between
pretest and posttest scores
were .08 and -.10 for the Multiple
and Single Hypothesis strategies, respectively.
Since the analyses of variance
are considered more conservative
statistical tests, only these are
reported. Significant main effects
were obtained for treatment (p <
.01), confirming the superiority of the
Single Hypothesis strategy. No significant
differences were obtained for
the verbalization variable. The group


Training Posttest ds | ea 
t 
2. 3.3* 
4. 1.0 
3. 3.6** 
3. 3.2* 
& Keislar, 1966). Study 2 


which spoke aloud the basis for problem
solution was not measurably superior
to the group which was not
required to verbalize.

Mental age proved to be an important
variable in the training and
posttest measures. This reflects the
fact that the posttest scores for the
high MA Multiple and Single Hy- 
pothesis treatment groups were 15.0, 
SD 3.5, and 17.0, SD 2.6, respec- 
tively; the low MA groups scored 
12.9, SD 4.3 and 15.2, SD 3.4 on 
these same tests. However, there were 
no reliable interaction effects with 
verbalization or treatment which 
could be attributed to differences in 
mental age. No differences held up 
after a 2-month retention period. 

While the Single Hypothesis group 
scored approximately the same on 
the retention test as they had on the 
posttest, the Multiple Hypothesis 
treatment showed an appreciable in- 
crement over their posttest scores, 
achieving a mean score only slightly 
below that of the Single Hypothesis 
group. 

DISCUSSION

Although this experiment did not
include a control group, comparing
the score on the pretest for problem-solving
with the scores on the training
and posttests demonstrates that
children can be taught strategies for 
solving certain problems, and that 
they will then be better able to solve 
problems which they were not able 
to solve before this training. These 
data support findings of the earlier 
study (Stern & Keislar, 1967) in 
which an untrained control had been 
included in the experimental de- 
sign. In that study, neither of the 
treatment groups had performed as 
well as could be expected on the basis 
of mathematical probability. Reex- 
amination of the two procedures re- 
vealed that very different tasks were 
involved. Since the primary objective 
of the training programs was to bring 
all the children up to an acceptable 
level of competence in the use of the 
treatment strategy before testing 
for effectiveness of problem-solving 
with that strategy, it could be argued 
that it was not necessary to adhere 
to training programs in which the 
items were strictly comparable. It 
was therefore decided to prepare the 
most effective programs possible for 
each of the strategy procedures. 

The revised program for the Mul- 
tiple Hypothesis strategy did produce 
performance comparable to that of 
the new Single Hypothesis program 
under the training conditions. How- 
ever, on the posttest, when the 
prompts were removed, the Single 
Hypothesis procedure was again sig- 
nificantly superior to that of the 
Multiple Hypothesis procedure. Evi- 
dently, young children are unable to 
profit fully from the advantages of 
the sophisticated strategy, whereas 
they can use the simpler strategy 
with optimum efficiency. Thus the 
first hypothesis was neither supported 
nor disconfirmed, although there 
seemed to be some basis to conclude 
that successively testing hypotheses
produces better problem solving with
the population tested than the more 
economical scanning procedure. 

The third hypothesis of the present 
study was not supported. This is con- 
trary to the findings of Weir and 
Stevenson (1959), as well as a num- 
ber of other studies cited in that re- 
port, which demonstrated that ver- 
balization concerning the stimuli 
aided learning at all age levels. 
These investigators also reported that 
they had found no evidence to sup- 
port the prediction that verbalization 
would be of less value to older children,
who, presumably, could use their
own implicit verbalizations of the
appropriate labels as mediators in
solving problems. It is interesting to
note that the “older” children re- 
ferred to were 9 years of age, or ap- 
proximately the same age (both 
chronologically and mentally) as the 
children used in the present study. 

As previously noted, all the chil- 
dren in the first study were required 
to verbalize the problem solution. The 
fact that the mean scores obtained 
in that study were generally lower 
than those of the present study, either 
with or without verbalization, lends 
some inferential support to the con- 
clusion that requiring the production 
of an overt verbal response does not 
facilitate learning. Perhaps the concepts
which were the basis of problem
solution were so familiar to all the 
children that the self-cueing effect 
of overt speech was of minimal im- 
portance. 

In conclusion, the present study 
provides further evidence that young 
children can be taught strategies 
which will improve their ability to 
solve certain types of problems. It 
seems that a simple strategy is more 
effective with this age group, and 
that, with the familiar materials 
used, requiring the overt verbalization
of the basis of problem solution
provides no reliable difference in the 
efficiency with which such problem 
solutions are attained. 

The content used for the concept- 
identification problems was of a sim- 
ple, primary stimulus nature. The 
rationale for this selection was to re- 
duce as much as possible any con- 
founding effects of individual differ- 
ences in degree of familiarity with 
the material. In this way it would be 
easier to look at the children's ability 
to utilize the strategies. An underly- 
ing motif of the study was to de- 
termine whether children could be 
taught the logical operations which 
comprised the component elements of 
the strategies at younger ages than 
would be expected on the basis of 
the Piaget formulations. It also at- 
tempted to devise a system of ex- 
ternalizing the children’s responses 
so as to make it possible for the 
experimenter to engage in formative 
evaluation, that is, to assess progress 
at each of the steps in the process. 

Whether the strategies taught 
would have transfer value in quite 
different types of contexts was not
tested. It seems reasonable to assume
that opportunity to practice the strategies
with a wide variety of materials
would be essential if a generalized
application of a problem-solving heuristic
is to be attained.


ACQUISITION AND RETENTION OF DISCRIMINATION
LEARNING SETS IN LOWER-CLASS
PRESCHOOL CHILDREN


PHYLLIS A. KATZ 
New York University 


Experiment I investigated the effects of programming sequences on the
learning set acquisition of lower-class nursery school children. 1 group
of 12 Ss was given simple discrimination problems which progressively
became more difficult. A 2nd group of 12 Ss received simple problems
which abruptly shifted in difficulty level. A 3rd group of 12 Ss received
an equivalent number of only difficult problems. The 3 groups did not
differ in their performance on criterion problems; however, all Ss exhibited
positive transfer relative to a control group which did not
receive prior training. A significant Treatment × Day × IQ interaction
indicated that the various programming sequences elicited differential
acquisition curves among the normal and low IQ children. The problem
sequence appeared to affect the performance of low IQ Ss more than
those Ss of normal intelligence. The second experiment assessed the
retention effects of the learning set acquisition. A group of 24 Ss was
tested 6 mos. after participation in Experiment I and their performance
was compared to a control group matched for IQ. The group that
had received prior training exhibited faster learning than the control,
thus demonstrating retention after a 6-mo. interval.


The phenomenon of discrimination 
learning sets has recently received a 
great deal of attention, and an ex- 
panding body of literature has demon- 
strated its occurrence across a wide 
variety of species, including rats 
(Koronakos & Arnold, 1957), cats 
(Warren & Baron, 1956), monkeys 
(Harlow, 1949; Miles, 1957), marmo- 
sets (Miles & Meyer, 1956), chimpan- 
zees (Hayes, Thompson, & Hayes, 
1953), and normal and retarded chil- 
dren (Ellis, Girardeau, & Pryer, 1962; 
Koch & Meyer, 1959; Shepard, 1957). 
Learning set refers to positive intra- 
problem transfer which occurs when 
subjects (Ss) are given practice on an 
extensive series of similar types of 
problems. Research in this area to
date has been largely concerned with
the role of such gross organismic variables
as phylogenetic differences (e.g.,
Ellis et al., 1962), intelligence level
(Jensen, 1963; Kaufman & Peterson, 
1958; Koch & Meyer, 1959) and de- 
velopmental status (Harlow, Harlow, 
Rueping, & Mason, 1960). Despite the 
clear relevance of learning sets to edu- 
cational practice, little experimental 
attention has thus far been devoted 
to the issue of how certain manipulable 
characteristics of the task itself may 
influence either the speed of learning 
set formation or its retention. Such 
issues as how the difficulty of the dis- 
crimination problem affects rate of 
learning set acquisition, what the 
most efficacious methods of stimulus 
presentation and programming are, 
and the kinds of testing conditions 
which give rise to long-range reten- 
tion all appear to be in need of in- 
vestigation. 

The major purpose of the present
study was to obtain information regarding
the role of task difficulty in
the formation of discrimination learn- 
ing sets by nursery school children. 
The difficulty level of the discrimina- 
tion tasks was operationally defined 
in the present investigation as a func- 
tion of the number of stimulus at- 
tributes common to the two discrim- 
inanda. Thus, two stimuli grossly 
differing with respect to color, shape, 
and size were regarded as constituting 
a simple discrimination, whereas two 
stimuli differing in only one of these 
attributes were regarded as a more 
difficult discrimination. This forma- 
tion is generally in accordance with 
the theory postulated by Bourne and 
Restle (1959). If the criterion task 
consists of difficult discrimination 
problems, the question may be raised 
as to which type of training sequence 
elicits maximal transfer. There are at 
least three types of programming se- 
quences which might logically be em- 
ployed. One technique might begin 
with relatively simple problems which 
become progressively more difficult. 
An alternative method is to begin with 
simple problems and abruptly shift to 
difficult problems. A third possibility 
is to begin with difficult problems and 
maintain the same level of difficulty 
throughout the training. The present
study attempted to assess the effects
of these three types of training se- 
quences upon the rate of learning set 
formation. 
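
The operational definition of difficulty given above can be expressed as a small classification rule. The following is a minimal sketch, not the author's procedure: the `Stimulus` class, the attribute values, and the function names are hypothetical, and the labels "easy," "intermediate," and "difficult" are taken from the group descriptions in the Method section.

```python
from dataclasses import dataclass

ATTRIBUTES = ("color", "shape", "size")

@dataclass(frozen=True)
class Stimulus:
    color: str
    shape: str
    size: str

def differing_attributes(a: Stimulus, b: Stimulus) -> int:
    """Count the attributes on which the two discriminanda differ."""
    return sum(getattr(a, name) != getattr(b, name) for name in ATTRIBUTES)

def difficulty(a: Stimulus, b: Stimulus) -> str:
    """Fewer differing attributes means a more difficult discrimination."""
    n = differing_attributes(a, b)
    if n == 0:
        raise ValueError("identical stimuli cannot be discriminated")
    return {3: "easy", 2: "intermediate", 1: "difficult"}[n]

# Hypothetical stimuli for illustration only
red_large_square = Stimulus("red", "square", "large")
blue_small_circle = Stimulus("blue", "circle", "small")
blue_large_square = Stimulus("blue", "square", "large")

print(difficulty(red_large_square, blue_small_circle))  # differs on all three attributes
print(difficulty(red_large_square, blue_large_square))  # differs on color only
```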

If the learning set phenomenon is
as important in the educational process
as many investigators have noted
(e.g., Bruner, 1964; Harlow, 1949,
1959), then it would be expected that
learning sets, once formed, should be
retained by the organism over relatively
long time periods. There has
been, however, no experimental demonstration
which supports this expectation.
Thus, an additional purpose
of the present investigation was to
obtain information regarding the retention
aspects of learning sets. Towards
this end, preschool children
were retested 6 months after their
initial training in order to assess the
degree of retention.


METHOD


Experiment I 


The Ss were 48 children enrolled in an 
experimental nursery school program spe- 
cifically designed for a culturally deprived 
population. The mean chronological age of 
the sample was 52 months. The children 
were from lower socioeconomic backgrounds, 
and most of them were Negro. Their Stan- 
ford-Binet IQ’s ranged from 69 to 111 and 
the mean IQ was 92. 

The stimuli employed consisted of pairs 
of pictures of geometric figures differing in 
shape, size, and color. The apparatus em- 
ployed to present the stimuli has been pre- 
viously described in detail by Kendler 
(1959). It consisted of two apertures which 
exposed the stimuli, and two levers under 
the apertures. The child was instructed to 
“press the stick under the picture you think 
is a winner.” If a correct choice was made, 
a marble was automatically delivered to S 
in a cup under the stick. For an incorrect 
choice, S had to return a marble to the ex- 
perimenter (E). He was given six marbles 
at the beginning of each problem which were 
placed in a marble board. The S was in- 
structed that if he won “enough marbles,” 
they could be traded in at the end for 
prizes, which included modeling clay, water 
pistols, crayons, jacks, and toy trains. The 
prizes were exhibited to the children at the 
beginning of the experimental session. 

The Ss were dichotomized with respect to 
IQ. The low group had a mean IQ of 84.2,
whereas the corresponding figure for the 
high group was 102.4. Within each intelli- 
gence classification, Ss were randomly as- 
signed to four experimental treatment 
groups, which differed according to the kind 
and number of problems presented during 
training. For all groups, three criterion 
problems were administered. These criterion 
problems, the “difficult” ones, consisted of 
two discriminanda differing along a single 
dimension, either color or shape. The various
experimental groups can be described
as follows:

1. Group I received six easy problems
(differing along three dimensions) on Day 
1, six problems of intermediate difficulty 
(differing along two dimensions) on Day 2, 
and three difficult problems (differing along 
one dimension) on Day 3; 

2. Group II received six easy problems 
on Day 1, six difficult problems on Day 2, 
and three difficult problems on Day 3; 

3. Group III received six difficult prob- 
lems on Day 1, six difficult problems on Day 
2, and three difficult problems on Day 3; 

4. Group IV received only the three difficult
criterion problems on one day of testing.
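
The four treatment schedules listed above can be laid out as data, which makes it easy to confirm that Groups I, II, and III received the same amount of prior training. This is only an illustrative sketch; the dictionary layout and the function name are assumptions, while the counts come from the group descriptions.

```python
# (difficulty, number of problems) for each day of testing, per group
SCHEDULE = {
    "I":   [("easy", 6), ("intermediate", 6), ("difficult", 3)],
    "II":  [("easy", 6), ("difficult", 6), ("difficult", 3)],
    "III": [("difficult", 6), ("difficult", 6), ("difficult", 3)],
    "IV":  [("difficult", 3)],
}

CRITERION_PROBLEMS = 3  # the final three difficult problems given to every group

def prior_training_problems(group: str) -> int:
    """Number of problems received before the three criterion problems."""
    total = sum(count for _, count in SCHEDULE[group])
    return total - CRITERION_PROBLEMS

for group in SCHEDULE:
    print(group, prior_training_problems(group))
```

Groups I, II, and III each receive 12 problems before the criterion problems, and Group IV receives none, so the training groups differ only in the sequence of difficulty levels.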

A maximum of 20 trials of each problem 
was administered. The position of the rein- 
forced stimulus varied between the right 
and left apertures. This variation was ran- 
domly determined in advance with the lim- 
itation that the reinforced stimulus did not 
recur in the same position more than twice 
in succession. The problem was discontinued
after either 20 trials or three consecutive
correct choices.
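
The trial procedure just described has two rules that can be stated precisely: the reinforced position never repeats more than twice in succession, and a problem stops at three consecutive correct choices or at 20 trials. The sketch below is one possible way to implement both rules; the source does not say how the random orders were actually drawn, and the function names are hypothetical.

```python
import random
from typing import List, Optional

def position_sequence(n_trials: int = 20, max_run: int = 2,
                      rng: Optional[random.Random] = None) -> List[str]:
    """Random left/right positions with no position recurring more than max_run times in a row."""
    rng = rng or random.Random()
    seq: List[str] = []
    for _ in range(n_trials):
        choices = ["left", "right"]
        # If the last max_run positions are identical, force a switch.
        if len(seq) >= max_run and len(set(seq[-max_run:])) == 1:
            choices.remove(seq[-1])
        seq.append(rng.choice(choices))
    return seq

def trials_administered(responses: List[bool], run_needed: int = 3,
                        max_trials: int = 20) -> int:
    """Trial on which the problem is discontinued, given a list of correct/incorrect responses."""
    streak = 0
    for trial, correct in enumerate(responses[:max_trials], start=1):
        streak = streak + 1 if correct else 0
        if streak == run_needed:
            return trial
    return min(len(responses), max_trials)

print(position_sequence(rng=random.Random(1)))
print(trials_administered([False, True, True, False, True, True, True]))
```

Note that `trials_administered` returns the trial on which testing stops; how the reported trials-to-criterion scores were counted relative to the criterion run is not stated in the source.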


Experiment II 

Six months after the original training, 24 
Ss were randomly selected from the low and 
high IQ conditions of Groups I, II, and III 
in the first experiment. These children com- 
prised the experimental group. A control 
group of 24 Ss, matched for IQ, was selected 
from the same nursery schools. These children
had not participated in the earlier
study. 

All Ss participating in Experiment II re- 
ceived 20 trials of three difficult discrimi- 
nation problems, administered on the same 
day. The apparatus, instructions, and rein- 
forcement procedure used were the same as 
described in Experiment I. 


RESULTS 


Experiment I 


The mean number of trials to criterion
(i.e., to three consecutive correct
responses) for each group on the 
three criterion problems is presented 
in Table 1. A comparison of the train- 
ing groups (Group I, II, and III) with 
the control group (Group IV) was 
significant at the .01 level (F1,44 =
7.85), indicating that the experimental
groups were superior on the criterion 


TABLE 1
MEAN NUMBER OF TRIALS TO CRITERION
ON CRITERION TASK

Group


problems to the group receiving no 
prior training. The difference among 
the means of the three training groups, 
compared by means of an analysis of 
variance, was not statistically signifi- 
cant. Thus, it may be concluded that 
all three training groups exhibited 
positive transfer relative to the control
group; however, the particular
treatment employed did not influence
the degree of learning exhibited on 
the criterion problems. 

In order to assess whether or not 
the particular programming sequences 
had any effect upon the rate of learn- 
ing set acquisition, a repeated mea- 
sures analysis of variance (Lindquist, 
1956) was conducted on the mean 
trials to criterion scores on the three 
days of testing for Groups I, II, and 
III. These means are presented in Fig- 
ure 1. The analysis of these scores re- 
vealed that the main effect of day 
was significant (F2,60 = 9.50, p < .01).
This finding indicates the general im- 
provement exhibited by the total sam- 
ple on each day of testing. The means 
for Days 1, 2, and 3 were 6.2, 3.9, and 
2.6, respectively. It can be seen in 
Figure 1 that most of the positive 
transfer was observable on the second 
day of testing. The analysis also indicated
that the interaction of Day × Treatment × IQ
was significant (F4,60 = 3.71, p < .05). This
interaction indicates that the various programming
sequences elicited differential acquisition rates
among the high and low IQ children. As can be seen in

Fig. 1. The mean number of trials required to reach criterion for each group on
each day of testing. (Axes: mean trials to criterion, by day 1 to 3.)


Figure 1, the high IQ groups generally 
show a gradual improvement over the 
3 days of testing under all experimen- 
tal conditions, whereas the low IQ 
children in Groups I and II actually 
exhibit a decrement in performance 
from Day 2 to Day 3. Thus, it would 
appear that for low IQ children, con- 
sistency of problem type may be a 
necessary condition for learning set 
acquisition (as was the case in Group 
III), whereas the particular program- 
ming sequence may not influence the 
brighter children at all. 


Experiment II 


An analysis of variance conducted 
on the mean trials to criterion scores 
of the control and experimental groups 
on the first discrimination problem 
indicated no significant differences. 
The possibility exists, however, that 
retention effects might be observable 
in terms of a faster rate of positive 
transfer. Consequently, the degree of
improvement exhibited by each S from
Problem 1 to Problem 3 was analyzed.
These scores are presented in Table 2.

The analysis of variance conducted 
on these difference scores yielded a 
treatments effect significant at the 
.05 level (F1,44 = 4.47), indicating
that the experimental group showed 
more improvement than the control 
Ss (an improvement of 44% compared 
with a decrement of 7%). Thus, Ss 
that had been previously exposed to 
discrimination training showed greater 
transfer. In addition, the Treatment
× IQ interaction was significant
(F1,44 = 5.61, p < .05), indicating a
differential pattern of improvement 
for the high and low IQ Ss in the two 
conditions. For the experimental 
group, the high IQ Ss showed more 
improvement than the low IQ Ss, 
whereas the opposite was true in the 
control group. 


TABLE 2
MEAN TRIALS TO CRITERION ON PROBLEMS
ADMINISTERED 6 MONTHS AFTER
ORIGINAL TESTING

Group | Problem 1 | Problem 3 | Difference
Experimental
High IQ | 10.18 | 5.36 | 4.82
Low IQ | 7.92 | 4.75 | 3.17
Control
High IQ | 5.75 | 9.92 | -4.17
Low IQ | 10.92 | 8.00 | 2.92

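
The percentages cited in the text (an improvement of 44% versus a decrement of 7%) can be recovered from the Table 2 means. The sketch below assumes that each condition's mean is the unweighted average of its high and low IQ subgroup means, which is exact only if the subgroups are of equal size; the variable and function names are illustrative.

```python
# (Problem 1 mean, Problem 3 mean) for each subgroup, from Table 2
TABLE_2 = {
    "Experimental": {"High IQ": (10.18, 5.36), "Low IQ": (7.92, 4.75)},
    "Control": {"High IQ": (5.75, 9.92), "Low IQ": (10.92, 8.00)},
}

def percent_improvement(cells: dict) -> float:
    """Percentage reduction in mean trials to criterion from Problem 1 to Problem 3."""
    problem_1 = sum(p1 for p1, _ in cells.values()) / len(cells)
    problem_3 = sum(p3 for _, p3 in cells.values()) / len(cells)
    return 100 * (problem_1 - problem_3) / problem_1

for condition, cells in TABLE_2.items():
    print(f"{condition}: {percent_improvement(cells):.1f}%")
```

A negative value indicates that more trials were needed on Problem 3 than on Problem 1, that is, a decrement. Under the equal-size assumption the results round to 44% for the experimental group and -7% for the control group, in agreement with the figures reported in the text.
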
DISCUSSION 


The findings of the present investi- 
gation indicate that the type and 
sequence of discrimination problems 
employed may influence the rate of 
learning set acquisition. This effect 
is not a simple one, however, since 
the various groups differed only in 
the shape of the learning curves, and 
not in their performance on the criterion
set of problems. The absence of a
general treatment effect with regard 
to the criterion problems is somewhat 
puzzling. There are at least two possi- 
ble explanations for this negative find- 
ing. The first is that the range of 
problem difficulty may not have been
broad enough to accurately reveal 
transfer differences. In the present 
study discrimination difficulty was 
determined in advance on the basis 
of a theoretical model based upon the 
number of varying stimulus attributes. 
It is conceivable, however, that the 
children employed were not attending 
equally to all stimulus dimensions. It 
would seem advisable for future work 
in this area to assess difficulty level 
empirically in advance. The results of 
the present investigation suggest that 
discrimination difficulty level may be 
a function of intelligence as well as 
stimulus complexity. Another possible 
explanation for the lack of criterion 
differences is that the number of 
training trials was too large and thus 
may have masked any differences 
attributable to program sequence. It 
is possible that a smaller number of 
practice trials in each treatment group 
would reveal differences in program- 
ming efficiency. Future work is indi- 
cated to test this possibility. The 
major factor influencing performance 
in this study appeared to be amount 
of training received, since all three 
training groups were superior to a 
control group that did not receive 
prior training. Since Ss were drawn 
from a population generally regarded 
as deficient in learning experiences, it 
is interesting to note that the training 
used in the present experiment did in- 
deed elicit learning sets. 

There is some evidence that the 
particular programming of the prob- 
lems may be a more significant varia- 
ble in the performance of low than in 
normal IQ children. For a group with 


normal intelligence, the particular 
problem sequence employed in this 
study did not seem to influence rate of 
acquisition although the sequence 
variable did affect the performance of 
low IQ Ss. For this latter group, con- 
sistency of problem type appeared to 
be the most efficacious presentation. 
The implication of this Treatment ×
Trials × IQ interaction for educational
practice is that more careful
attention may have to be given to 
the training experiences of children 
who are most deficient in learning 
skills. 

The issue of how intelligence level 
is related to learning set performance 
is not entirely clear. Although a num- 
ber of investigators have found that 
differences in rate of learning set ac- 
quisition are positively associated with 
intelligence level (e.g., Ellis et al.,
1962; Jensen, 1963; Kaufman & Peter- 
son, 1958; Koch & Meyer, 1959), 
other investigators have not obtained 
this relationship (e.g., Wischner &
O'Donnell, 1962). The findings of the 
present investigation suggest that this 
relationship may occur only under 
certain kinds of experimental condi- 
tions. In earlier studies IQ differences 
in learning set acquisition have been 
obtained when the discrimination task 
employed has been relatively difficult 
or the success criterion very high (e.g.,
90% correct in the Kaufman & Peterson,
1958, and Koch & Meyer, 1959,
studies). Where the success criterion
has been less stringent (e.g., two con- 
secutive choices in the Wischner & 
O'Donnell, 1962, study, or three con- 
secutive choices in the present study), 
differences between IQ groups have
been less pronounced. Future research 
is indicated to clarify the relation of 
learning set acquisition to intelligence 
under a wider variety of experimental 
procedures. 

The present study supports the expectation
that when learning sets are
acquired, they are retained by the 
organism over relatively long periods 
of time. This retention effect, once 
again, is not a simple one. The results 
of the second experiment indicated 
that a group previously exposed to 
training did not differ from a control 
group on the first problem presented 
after a 6-month interval. When im- 
provement over a series of three prob- 
lems was considered, however, the 
experimental group exhibited signifi- 
cantly more improvement. This sug- 
gests that the effect is not one of 
immediate memory, but rather one 
of enhanced speed of learning. Clearly, 
future investigations are needed which 
systematically vary the length of the 
retention interval and the type of 
original training in order to discover 
the most efficacious combinations. 


LECTURE VERSUS PARTICIPATION IN THE HEALTH
TRAINING OF PEACE CORPS VOLUNTEERS


JEAN S. KERRICK, VIRGINIA A. CLARK, AND DONALD T. RICE
School of Public Health, University of California, Los Angeles 


A 2 × 2 factorial design with lecture versus participation as 1 variable
and teaching team as the other was used to assess the health training 
of 272 Peace Corps volunteers. Before-after training tests were given 
for knowledge, attitudes, and beliefs about specific illnesses, and in- 
tention to follow recommended personal and community health be- 
haviors. Results indicate that, with the specific task of health training, 
participation is not more effective than lecture, although teachers or 
teaching teams do differ in effectiveness. Further, if students do have 
a preference, it is for the lecture method. 


When Peace Corps volunteers move
from this country to overseas assign- 
ments, they move from an environ- 
ment where the major diseases are 
those of old age to societies where 
the major diseases are still communi- 
cable diseases avoided only by be- 
havior. 

Health training, then, is only partially
designed to impart knowledge.
A major task is that of changing at- 
titudes and intended health behav- 
ior. Volunteers must be persuaded to 
perform tasks which are annoying, 
repetitive, unpleasant, and some- 
times embarrassing. For example, in 
some areas they must boil all water, 
even for bathing, they must not eat 
lettuce, ice cream, or other suspect 
foods, and they must often refuse to 
eat foods offered them by their hosts. 

Two major lines of research lead 
to the present study. For more than 
40 years, educators have been experi- 
menting with lecture versus partici- 
pation as classroom methods. For 
more than 20 years, psychologists 
and others have been concerned with 
the two methods for short-term persuasion.

Studies as early as 1928 (Spence, 
1928), and as late as 1954 (Ruja, 
1954), indicate that, in the classroom, 
the lecture method is either superior 
to participation methods in imparting
knowledge (as measured on multiple-choice
information tests), or it is
not significantly different from par- 
ticipation or discussion methods. 

On the other hand, studies during 
the 1940s by Lewin (1947) and his 
associates clearly demonstrated the 
superiority of discussion or partici- 
pation methods in changing reported 
behavior. While major variables in 
the early studies were confounded, 
Bennett (1955) clarified the issue. 
She concluded that the methods, per 
se, did not differ in effectiveness, but 
that the decision to act is the effective 
variable in increasing the probability 
of both reported and actual behavior. 

The present study extends both 
lines of previous research. It extends 
the study of classroom learning by 
using a counterbalanced design to 
determine the effectiveness of the 
teacher under both methods, and to 
test the interaction between teacher 
and method. Second, it extends the 
time-span normally used in the stud- 
ies of persuasive effect. 

More important, however, the
study focuses on a kind of behavior
unlike those possible in the usual
experiment. In the earlier studies 
there were only minor unpleasant 
consequences of recommended be- 
haviors. Even studies of risk-taking 
behavior (Bem, Wallach, & Kogan, 
1965) use unpleasant potential consequences
such as nausea and dizziness
for only an hour's time. These
studies have suggested that partici- 
pation methods may be harmful 
when the consequences of behavior 
are unpleasant. But such minor un- 
pleasant consequences can hardly be 
compared with potential effects of 
malaria, amebic dysentery, hook-
worm, or schistosomiasis.

The present study, then, examines 
lecture versus participation method 
in a classroom setting where the de- 
sired result is the extensive modifica- 
tion of behavior to reduce the risk 
of realistic illness with highly un- 
pleasant consequences. 


DESIGN AND PROCEDURE 


Course Content 


The first half of the 20-hour health-train-
ing course was focused on communicable
disease theory; the classification, mode of 
spread of infectious diseases, immunity, and 
behavioral defenses against disease were 
studied. 

Diseases prevalent in the host country, 
food- and water-borne diseases, respiratory, 
skin, zoonotic and insect-borne diseases, were 
described in relation to sanitation, environ- 
ment, and recommended personal hygiene. 

The second segment of the course covered 
problems of culture and health, community 
health education, school health and educa- 
tion, and community action related to 
health. 

All instructors in the training course were 
given a standard outline of content and a 
list of teaching aids such as films, tapes, and 
readings. Each student was given a copy of 
the Peace Corps Health Manual, outlining 
the major illnesses and symptoms and treat- 
ments of prevalent diseases. 


Experimental Design 


The present study employed a 2 x 2 
factorial design with lecture method and 
participation method of teaching as one
variable, and teaching team as the other.

Each teaching team was composed of a 
physician to teach personal health and a 
health educator to teach community health. 
Physicians and health educators were se- 
lected because they were experienced in both 
teaching methods, and were arbitrarily 
paired. Each team taught two sections, one 
by the lecture method, and the other by the 
participation method. 

In lecture sections, instructors were asked 
to limit class questions and communica- 
tion among students to 10 minutes. In par- 
ticipation sections, formal lecture was 
limited to 15 to 20 minutes per session, with 
30 to 35 minutes of student-to-student in- 
teraction, emphasizing group discussion, 
buzz groups, role-playing situations, panels, 
and so forth. 

Subjects (Ss) were 323 Peace Corps 
trainees destined for Ethiopia as teachers 
in secondary and higher elementary schools. 
For each of two daily periods available for 
health training, 50 Ss were randomly as- 
signed to the participation section and the 
remainder were put into the lecture group. 
Of the original Ss, 293 completed training 
and 21 failed to complete the tests for the 
present study. The results are thus based 
on 272 trainees. 


Measurements 


Before and after health training, all 
trainees were given tests of factual knowl- 
edge, attitudes and beliefs about specific 
illnesses, and intention to follow recom- 
mended personal and community health 
behaviors. 

The knowledge test consisted of 24 mul- 
tiple-choice items and 10 matching ques- 
tions. For the most part, information items 
had been used in earlier Peace Corps health 
training courses at the University of Cali- 
fornia, Los Angeles. 

A previously constructed form of the se-
mantic differential was used to assess atti-
tudes toward illness (which we have called
“severity”), and belief in the illness’ avoid-
ability or unavoidability,² the two major
factors isolated in a study of an illness form
of the differential. The illness test obtained
responses to 25 illnesses likely to occur
overseas.


² Our distinction between attitude and
belief is the same that has been made by
previous authors such as Krech and Crutch-
field (1948), and Allport (1954), and opera-
tionally specified by Fishbein and Raven
(1962).


A special form of the differential was con-
structed to determine expressed likelihood
or intention to follow the major recom- 
mended health-related behaviors. Twenty- 
seven behaviors were included in the test. 
Expression of high "likelihood" of follow- 
ing behaviors appears directly comparable 
with measures of private commitment to 
behavior (Bennett, 1955). 

A course-reaction questionnaire was given
to all four sections at the end of health train-
ing. The questionnaire, based on comments
from previous volunteers, called for ratings 
of health training. Five of the eleven items 
were reversed, so that approximately half 
of the items were phrased as favorable, half 
unfavorable. Most items permitted “yes-no” 
answers (“Do you feel that you were given 
enough specific information?”), although 
two permitted three alternatives (“In gen- 
eral, how adequately did the lectures con- 
vey the information?” and “How helpful 
was the student participation in learning 
the health material?” “Very, somewhat, not 
very."). Five items related to material, six 
to teaching. 

Personal health and community health 
segments of the course were rated separately 
on each item. 

In summary, the following information 
was obtained before and after the health 
training course: knowledge of illness and 
disease, attitudes toward selected illnesses, 
beliefs in the avoidability or unavoidability 
of those illnesses, and the extent to which 
the trainee saw himself as likely to carry 
out personal and community health-related 
behaviors. After training, course reaction 
was determined. 


RESULTS 


All data were run through standard
screening programs giving posttest
data plotted against pretest data, re- 
gression lines, and correlation co- 
efficients. In most cases the ran- 
domization procedure adequately 
equated pretest measures, but the 
correlations between pre- and post- 
measures were high. Thus, the major 
analytic tool was a two-way (team 
versus method) nonorthogonal anal- 
ysis of covariance, with pretest scores 
controlled. Table 1 gives the means 
for each group on all tests. 
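
As a reading aid for Table 1, the "Totals" entries appear to be means of the cell means weighted by group size, not simple averages. The Python sketch below checks this for the Team 1 knowledge scores, using the cell means and Ns printed in the table and its note. The weighting rule is an assumption inferred from the figures, not a procedure stated in the article, and the name `weighted_mean` is illustrative only.

```python
def weighted_mean(means, sizes):
    """Return the mean of several group means, weighted by group size."""
    total_n = sum(sizes)
    return sum(m * n for m, n in zip(means, sizes)) / total_n


# Team 1 knowledge scores: participation (N = 46) and lecture (N = 74).
sizes = [46, 74]
exam1 = weighted_mean([51.0, 52.3], sizes)  # pretest cell means
exam2 = weighted_mean([59.9, 62.2], sizes)  # posttest cell means

# Compare these with the Team 1 "Totals" row under Knowledge scores.
print(round(exam1, 1), round(exam2, 1))
```

The same check can be repeated for the other rows by substituting the corresponding cell means and group sizes.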
Knowledge Gain


The total group averaged 11.1 
points higher on the posttest, Exam 
2, than on the pretest, Exam 1, a 
significant overall difference (by t- 
test, p < .05). Thus, the total group 
did show significant information gain 
after the course. The covariance anal- 
ysis indicated that teaching team 
had a significant effect (p < .05) 
while method did not. The groups 
taught by Team 2, on the average, 
learned more than those taught by 
Team 1. But neither the teaching
method nor the interaction of Team 
X Method produced a significant 
difference. Team 2, then, was su-
perior, regardless of method. The
covariate (pretest score), as ex- 
pected from the high correlation 
between the pre- and posttest, was 
significant, indicating that those who 
initially scored highest on the knowl- 
edge test tended to score highest on 
the posttest. 
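
The logic of controlling for pretest scores can be illustrated with the adjustment step of a one-way analysis of covariance. The sketch below is not the authors' two-way nonorthogonal analysis; it only shows, under simplifying assumptions, how each group's posttest mean is adjusted for its pretest mean by a pooled within-group regression slope. The scores are hypothetical and were invented for illustration; the function name `adjusted_means` is likewise an assumption.

```python
def adjusted_means(groups):
    """Covariance-adjusted posttest means.

    groups maps a label to a list of (pretest, posttest) pairs.
    """
    all_pre = [pre for pairs in groups.values() for pre, _ in pairs]
    grand_pre = sum(all_pre) / len(all_pre)

    sxy = sxx = 0.0
    stats = {}
    for label, pairs in groups.items():
        pre_mean = sum(p for p, _ in pairs) / len(pairs)
        post_mean = sum(q for _, q in pairs) / len(pairs)
        stats[label] = (pre_mean, post_mean)
        # Accumulate within-group sums of products and squares.
        sxy += sum((p - pre_mean) * (q - post_mean) for p, q in pairs)
        sxx += sum((p - pre_mean) ** 2 for p, _ in pairs)

    slope = sxy / sxx  # pooled within-group regression slope
    return {
        label: post_mean - slope * (pre_mean - grand_pre)
        for label, (pre_mean, post_mean) in stats.items()
    }


# Hypothetical (pretest, posttest) scores, not data from the study.
scores = {
    "lecture": [(1, 2), (2, 3), (3, 4)],
    "participation": [(3, 6), (4, 7), (5, 8)],
}
adjusted = adjusted_means(scores)
print(adjusted)
```

A group that started higher on the pretest has its posttest mean adjusted downward, so that groups are compared as if they had begun at the same level.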


Attitude Change 


Positive scores indicate illnesses 
which are “good” and “mild,” and 
negative scores indicate illnesses 
which are “bad” and “severe.”³ The
group as a whole did not show sig- 
nificant attitude change, but the 
analysis of covariance did show a sig- 
nificant difference by teaching method 
(p < .02). Here, there was no dif- 
ference between teaching teams and 
no Team X Method interaction; but
again, as expected, the covariate 
(initial attitudes toward illness) 
was significant. In general, illnesses 
tended to be viewed as bad and 
severe, but classes taught by the 


³ The figures reported here are essentially
mean summary scores over individuals and
illnesses, using a procedure of weighting
each scale response on the semantic dif-
ferential according to its component loading
on the attitudinal dimension, then weighting
each illness according to its relative “good-
ness” or “badness.”
TABLE 1
MEANS FOR EACH GROUP BEFORE AND AFTER TRAINING ON ALL TESTS

                         Team 1                   Team 2               Totals
Item              Exam 1  Exam 2     r    Exam 1  Exam 2     r    Exam 1  Exam 2

Knowledge scores
  Participation     51.0    59.9   .80      51.1    62.3   .66      51.0    61.0
  Lecture           52.3    62.2   .61      50.3    63.0   .69      51.1    62.7
  Totals            51.8    61.3            50.5    62.8

Attitude change (severity of illness)
  Participation    -65.5   -55.2   .67     -70.1   -56.1   .65     -67.6   -55.6
  Lecture          -68.5   -75.1   .50     -66.2   -65.0   .57     -67.1   -69.0
  Totals           -67.3   -67.3           -67.2   -62.8

Belief change (avoidability of illness)
  Participation   -246.1  -252.6   .58    -225.8  -236.4   .80    -237.0  -245.4
  Lecture         -237.0  -247.0   .67    -240.1  -251.5   .60    -238.9  -249.7
  Totals          -240.6  -249.2          -236.5  -247.7

Expressed likelihood of following recommended personal behaviors
  Participation     47.1    71.9   .45      47.9    66.7   .56      47.5    69.6
  Lecture           47.5    73.5   .64      48.4    62.7   .52      48.0    66.9
  Totals            47.3    72.9            48.3    63.7

Likelihood of engaging in community health behavior
  Participation    -22.4   -23.1   .68     -11.7   -12.8   .35     -17.6   -18.5
  Lecture          -27.4   -32.1   .60     -21.4   -25.1   .71     -23.8   -27.8
  Totals           -25.4   -28.6           -19.0   -22.0


Note.—For Team 1, Knowledge Scores, Participation Group, N = 46, Lecture Group, 
N = 74; for all other items, Participation Group, N = 47, Lecture Group, N = 73. For 
Team 2, Knowledge Scores, Participation Group, N = 38, Lecture Group, N = 114; for all 
other items, Participation Group, N = 38, Lecture Group, N = 113. 


participation method—either by Team
1 or by Team 2—believed illnesses to
be less severe after training than be- 
fore. In general, then, participation 
groups moved in the direction of more 
favorable attitudes toward illness, but 
it should be noted that a “favorable” 
attitude toward illness may not lead 
to illness-preventive behavior. 


Belief Change


Table 1 presents changes in belief
about the avoidability (positive
scores) or unavoidability (negative
scores) of illnesses.⁴

For the total group, illnesses are
viewed as more avoidable after tak-
ing the course than before taking it.
In the covariance analysis, however,
only the covariate was significant.
Thus, there was a significant correla-
tion between pre- and posttest scores,
but there was no difference in effect-
ing belief-change by teams, methods,
or by the interaction between them.


⁴ These scores were derived in the same
manner as the attitude scores, using the
component loadings on the avoidability
component of judgment toward illness
rather than the evaluative or attitudinal
component, and weighting illnesses by
relative avoidability or unavoidability.


Change in Commitment


In general, trainees reported an in-
creased likelihood of following be-
haviors related to personal health,
and a decreased likelihood of engag- 
ing in community health activity. 

For both analyses, the pretest score 
made a significant difference. For per- 
sonal health behaviors, however, there 
were no other significant differences. 
Thus, either team, by either method, 
was equally likely to raise the com- 
mitment to follow recommended per- 
sonal health behaviors. 

When community health was con- 
sidered, however, method was a sig- 
nificant source of variation. Thus, 
those in the participation groups 
initially saw themselves as unlikely 
to engage in community health be- 
haviors, and remained relatively un- 
changed by the training course. The 
lecture groups also initially expressed 
a lack of likelihood of engaging in 
community health behaviors, but saw 
these behaviors as even more un- 
likely after the course. It appears 
that the participation method made 
little change in the level of commit- 
ment to community action, while the 
lecture method actually lowered it, or 
conversely, increased a negative com- 
mitment. 


Course Reaction 


Guttman scales gave coefficients of 
reproducibility for material, presenta- 
tion, and overall course reaction that 
were .93, .90, .87, indicating ade- 
quate reproducibility. Intercorrela- 
tions between Guttman scores for 
personal and community health seg- 
ments indicate that students did 
make somewhat different judgments 
about the two segments, although the 
two were significantly correlated. Sim- 
ilarly, students distinguished between 
content and method. For personal 
health, content and presentation scores 
over individuals correlated .50, and 
for community health, .43. 
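
For readers unfamiliar with the coefficient of reproducibility, the sketch below shows one common way of computing it: 1 minus the proportion of responses that depart from the ideal cumulative pattern implied by each respondent's total score. The article does not say which error-counting rule was used, so this rule, the function name, and the small response matrix are all assumptions made for illustration.

```python
def coefficient_of_reproducibility(responses):
    """responses: list of rows of 0/1 answers, one row per respondent."""
    n_items = len(responses[0])
    # Order items from most to least frequently endorsed.
    order = sorted(range(n_items),
                   key=lambda j: -sum(row[j] for row in responses))
    errors = 0
    for row in responses:
        score = sum(row)
        # Ideal scale pattern: the `score` most popular items endorsed.
        ideal = [1] * score + [0] * (n_items - score)
        observed = [row[j] for j in order]
        errors += sum(o != i for o, i in zip(observed, ideal))
    return 1 - errors / (len(responses) * n_items)


# Hypothetical answers from five respondents to three items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],  # departs from the cumulative pattern
]
cr = coefficient_of_reproducibility(answers)
print(round(cr, 2))
```

Values near 1 indicate that total scores reproduce the individual item responses almost perfectly, which is the sense in which the reported coefficients are described as adequate.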

Mean course reactions for the 
total group are approximately half 
TABLE 2
MEAN COURSE REACTIONS BY TEACHING TEAM AND METHOD: GUTTMAN SCALE SCORES

                            Personal health section   Community health section
                              Team 1      Team 2         Team 1      Team 2

Overall course reaction
  Participation                8.77        6.47           8.28        8.26
  Lecture                      6.85        4.88           8.15        7.03
  Total                              6.38                       8.19

Reaction to presentation
  Participation                6.32        4.26           5.68        5.76
  Lecture                      5.08        3.58           5.71        5.62
  Total                              4.61                       5.74

Reaction to material
  Participation            [illegible] [illegible]        3.26        3.68
  Lecture                      2.75        2.34           3.15        3.18
  Total                              2.71                       3.28

Note.—High score indicates unfavorable response.


of the scale range (and medians also 
fall at about the midpoint of po- 
tential scores). Table 2 indicates that 
for personal health Team 2 was con- 
sistently superior to Team 1, accord- 
ing to students, and that lecture was 
consistently preferred to participa- 
tion. While it is conceded that Gutt- 
man scores cannot be considered in- 
terval scales, some indication of the 
significance of these findings may be 
obtained from analysis of variance, 
keeping in mind that the statistical 
assumptions of normally distributed, 
continuous data are not met, al- 
though volunteers were randomly as- 
signed to groups. 

For personal health, by more than 
twice the F-ratio needed for the .05 
level, both method (F = 16.7) and 
teaching team (F = 24.7) were sig- 
nificantly different, and there was no 
significant interaction between them. 
It seems safe to say that both team
and method produce different course 
reactions. 

For the community health section, 
however, there were virtually no 
differences between either team (F = 
1) or method (F = .2), and no F- 
ratio remotely approached signifi- 
cance. In short, neither the method 
nor the teaching team made a dif- 
ference. 


DISCUSSION


A two-way factorial design was 
used to test the effectiveness of 
lecture versus participation in health 
training of Peace Corps volunteers. 
Measures of knowledge gain, atti- 
tude change, belief change, and 
changes in commitment to action were 
tested by nonorthogonal analyses of 
covariance (controlling for pretrain- 
ing scores) to determine the effect of 
teaching team, teaching method, and 
the interaction between them. 

A review of the literature suggests 
participation is not superior in pro- 
ducing a knowledge gain. That hy- 
pothesis was supported by the present 
data. Lecture and participation 
groups gained the same amount of 
information, but one teaching team, 
using either method, was significantly 
better than the other. Thus, the 
teacher, not the method, appears to 
influence information gain. 

Related to knowledge change, one 
aspect of belief about illness, its 
avoidability or unavoidability, was 
affected by the training course. Vol- 
unteers viewed illness as more avoid- 
able after training than before. This 
belief was not differentially affected 
by either teaching team or method, 
however. 

A relatively recent review article 
(Stovall, 1958) predicts more atti- 
tude change by participation than 
by lecture. This, too, was supported 
by the present study. In general, the 
participation groups saw illnesses as
“better” or “milder” after the course
than before it. With illnesses, how- 
ever, it may well be that a favorable 
attitude toward illness is what one 
least wishes to encourage. One theo- 
retical position (Rosenstock, Hoch- 
baum, & Kegeles, 1960), suggests 
that health-related behavior will be 
less likely to occur when attitudes are 
“positive,” and illnesses are seen as 
mild and less bad. In this instance, 
then, significant positive change may 
be interpreted as arguing against the 
use of the participation method where 
illnesses are being considered. 

Given the specific sample, and the 
specific task of teaching knowledge 
about illness and behaviors which 
will prevent illness, it appears that 
the participation technique is not 
more effective in presenting knowl- 
edge (although the teacher or teach- 
ing team does differ) and it makes no 
difference in the one belief item 
measured. 

Bennett (1955) observed that most 
of the differences in behavior change 
noted in lecture versus discussion ex- 
periments could be attributed to the 
individual’s commitment to action, 
and that commitment was not di- 
rectly related to teaching method. 
The present study controlled for com- 
mitment by asking all groups to rate 
their likelihood of following certain 
behaviors. Consistent with Bennett,
the present study found that method
did not differentially increase positive
commitment toward either personal
or community health behavior. The
lecture method, however, significantly
increased the negative commitment
toward community health behaviors.

Perhaps the difference may be ex-
plained by Bennett’s finding that dis-
cussion members tended to show
slightly higher commitment, but no
differences in actual behavior.
If increased knowledge, changing 
attitudes, and commitment to recom-
mended personal behaviors are our
objectives, the lecture method is 
probably preferable. While some 
teachers are more effective in impart- 
ing knowledge, experienced teachers 
in the present study did equally well 
with either method in producing 
commitment to personal health be- 
haviors. 

It should be noted that in no case 
did a significant Teaching Team X 
Teaching Method interaction occur. 
This lack of interaction, however, 
may have resulted from the choice of 
instructors experienced in both meth- 
ods. Under other conditions, signifi- 
cant interactions may be observed. 

For course reaction, both team and 
method made a significant difference 
in the personal health segment. Team 
2 was preferred to Team 1, and 
lecture was preferred to participa- 
tion. 

For the community health seg- 
ments, neither team nor method made 
a significant difference. 

In neither personal nor community 
health was there a significant in- 
teraction between team and method. 

The present study suggests that 
where students are motivated and 
relatively interested, they prefer lec- 
ture to participation—perhaps be- 
cause more material can be pre- 
sented in a shorter amount of time. 
Spontaneous comments from stu- 
dents suggest that, in health courses, 
they have too little information to
utilize discussion, role-playing, or 
other participation methods effec- 
tively. 

REFERENCES 


Allport, G. W. The nature of prejudice.
New York: Addison-Wesley, 1954.

Bem, D. J., Wallach, M. A., & Kogan, N.
Group decision making under risk of
aversive consequences. Journal of Per-
sonality and Social Psychology, 1965, 1,

Bennett, E. B. Discussion, decision, com-
mitment, and consensus in “group de-
cision.” Human Relations, 1955, 3, 251-
273.

Fishbein, M., & Raven, B. H. The AB
scales: An operational definition of belief
and attitude. Human Relations, 1962, 15,
35-44.

Krech, D., & Crutchfield, R. Theory
and problems of social psychology. New
York: McGraw-Hill, 1948.

Lewin, K. Group decision and social change.
In T. M. Newcomb & E. L. Hartley
(Eds.), Readings in social psychology.
New York: Holt, 1947. Pp. 330-344.

Rosenstock, I. M., Hochbaum, G. M., &
Kegeles, S. Determinants. Paper pre-
sented at the Golden Anniversary White
House Conference on Children and Youth,
Washington, March 1960.

Ruja, H. Outcomes of lecture and dis-
cussion procedures in three college courses.
Journal of Experimental Education, 1954,
22, 385-394.

Spence, R. B. Lecture and class discussion
in teaching educational psychology. Jour-
nal of Educational Psychology, 1928, 19,
452-402.

Stovall, T. F. Classroom methods, II: Lec-
ture vs. discussion. Phi Delta Kappan,
March, 1958, 255-258.


(Received July 8, 1966) 


Journal of Educational Psychology
1967, Vol. 58, No. 5, 266-272


LEARNING FROM PROSE MATERIAL:
LENGTH OF PASSAGE, KNOWLEDGE OF RESULTS, AND
POSITION OF QUESTIONS


LAWRENCE T. FRASE 
University of Massachusetts 


72 college Ss read a 2000-word biographical prose passage. A 2 X 
3 X 2 factorial analysis assessed the effects of position of factual ques- 
tions within the text, length of passage, and presence of knowledge 
of results. Posttest analysis focused upon (a) questions which had 
occurred during reading (retention questions), and (b) questions re- 
lated to the section of the prose passage not tested by the retention 
questions (incidental questions). All 3 factors were significant for 
retention questions; only the position of questions was significant for 
incidental questions. A moderate length passage was optimal for re- 
tention scores while scores on incidental questions tended to improve 
with longer passages. When questions occurred after passages both 
retention and incidental scores were high. Elicitation of “mathema- 
genic” behavior provided the most concise interpretation of the data. 


The aim of the present study was 
to determine how factual questions 
might be used to improve retention 
of prose materials. Such questions, 
placed within a prose passage, might 
have specific effects, improving reten- 
tion only on sections of the prose di- 
rectly relevant to the question. The 
same questions might also have a 
more general effect, influencing the re- 
tention of prose information which is 
incidental to the questions asked. 

Rothkopf (1965, 1966) has investi- 
gated issues raised by the general 
problem posed above. He found that 
questions can have both a specific and 
general facilitative effect upon reten-
tion of prose material. He also found
that a simple instruction to read care- 
fully facilitated retention in general. 
Rothkopf maintains that questions 
elicit a general class of responses, 
called “mathemagenic behaviors,” 
which aid in the acquisition and re-
tention of information. The “cyber-
netic” approach complements the
view held by Rothkopf. Smith and 
Smith (1966) imply that questions 
may provide an internalized criterion 
for attending to the content of a 


prose passage. According to this view, 
question-relevant prose content would 
elicit appropriate mathemagenic be- 
haviors. Placing the question in front 
of the prose passage would be a
necessary condition for this effect 
to occur. Rothkopf (1966) concludes 
from his data, however, that specific 
effects of question are obtained re- 
gardless of the location of questions. 
In the present study this problem 
was explored by placing factual 
questions either before or after pas- 
sages to which they relate. 

Another question explored by 
Rothkopf was the use of knowledge of 
results (KR). When questions come 
before related prose passages and
knowledge of results is given, the 
prose material may interfere with the 
retention of the questions. In short, 
the reading of prose material, in this 
situation, may be viewed as a form of 
interpolated learning between ques-
tions. Berlyne (1966) has suggested
the need for research in this area. 
Intuitively it seems that if a question 
occurs without KR, reading the rele- 
vant prose passage after the question 
should provide KR and hence reten-
tion test scores should be improved.
Rothkopf’s (1966) data indicated 
that this may not be the case. His 
results suggest the possibility of an 
interaction between KR and position 
of questions, but the nature of this 
interaction was not clear from his 
data. Rothkopf’s major emphasis was 
on the comparison of his individual 
groups with each other and with a 
control group. The present study, 
rather than focusing upon the con- 
trol condition, provided information 
concerning possible interaction of the 
variables studied. 

A final variable of potential rele-
vance is the amount of prose mate-
rial interpolated between questions.
Is there an optimal rate of question- 
ing and does this rate differentially 
affect the retention of information 
which is relevant or incidental to the 
questions asked? Ausubel’s (1962) 
view of meaningful verbal learning, 
contrary to current “active response” 
requirements of programmed instruc- 
tion, suggests that retention may be 
improved if subjects (Ss) are al- 
lowed to read passages as a whole. In- 
stead of spacing prose reading by 
interpolating questions between pas- 
sages, Ss reading prose under whole 
conditions would be more likely to 
relate the facts contained in the pas- 
sage meaningfully. An alternative 
view is that if questions occur in the 
prose material they should occur in 
close proximity to the relevant con- 
tent. Programming, which uses test- 
like events, necessitates breaking the 
material into small chunks to facili- 
tate associations between questions 
and content. In the present study, size 
of passage between questions was 
varied. The two types of posttest 
questions mentioned above (relevant 
and incidental) pertain to prose con-
tent in which relevant test items oc-
cur or do not occur. Perhaps the pro-
gramming approach and Ausubel's
view are both correct, in which case 
the effect of length of material would 
differ for retention and incidental 
questions. 


METHOD


Subjects and Experimental Design 


The experiment was presented as a lab- 
oratory assignment to 79 students of intro- 
ductory educational psychology. 

A 2 X 3 X 2 factorial design was used, 
the three factors being: (a) questions before 
or after the reading passages, (b) length of 
passage (10, 20, or 40 lines long), and (c) 
KR present or absent following the ques- 
tions. 
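
The layout of the design can be sketched by enumerating its 12 cells. The Python fragment below is illustrative only: it assumes that the 72 experimental Ss mentioned in the abstract were spread evenly over the cells (6 per cell), and the shuffling routine, seed, and names such as `assign` are hypothetical stand-ins for whatever randomization procedure was actually used.

```python
from itertools import product
import random

positions = ["before", "after"]   # question position relative to passage
lengths = [10, 20, 40]            # lines of prose per passage
kr_options = [True, False]        # knowledge of results present or absent

# Every combination of the three factors is one experimental group.
cells = list(product(positions, lengths, kr_options))


def assign(subject_ids, cells, seed=0):
    """Shuffle subjects and deal them evenly across the cells."""
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    return {cell: ids[i::len(cells)] for i, cell in enumerate(cells)}


groups = assign(range(72), cells)
for cell, members in groups.items():
    print(cell, len(members))
```

With 72 subjects and 12 cells, each group receives 6 subjects; an uneven total would leave some cells one subject larger than others.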


Stimulus Material 


A continuous prose passage of biographi- 
cal material on William James was selected 
from Psychology: The Science of Mental 
Life, by Miller (1962).² The passage was
divided into 20 paragraphs of 10 lines each. 
The material was reproduced on mimeo- 
graph with each paragraph a separate sheet 
of 4 X 11-inch paper. Two multiple-choice 
questions (5 alternatives) which required 
the recall of specific factual information 
(such as the number of children in the 
James family, or a course of study under- 
taken by William James at a particular 
time) were constructed for each paragraph.
These two questions did not overlap in 
terms of their content. The 20 questions re- 
lating to the second part of the paragraphs 
were used as retention questions on the 
posttest. They occurred during the learning 
session (either before or after passages) and 
also on the posttest. The other set of 20 
questions, relating to the first half of the 
paragraphs, was used to test for incidental 
learning. Pretesting indicated that the two 
sets of questions were at the same chance 
level of difficulty. 


Procedure 


The experiment was given on 2 consecu- 
tive days. The Ss were randomly assigned 
to one of the 12 groups for 1 of the 2 days. 
This meant that ½ of the Ss under each


² Permission for the experimental use of
these copyrighted materials was kindly
granted by the publishers, Harper & Row,
Inc., 49 East 33rd Street, New York, N. Y.
TABLE 1
COMPARISON OF PERCENTAGES CORRECT WITH DATA OF ROTHKOPF (1966)

                 LBA     B     BA     A     AA   Control
Rothkopf
  Retention       78    65     78    63     82      29
  Incidental      36    30     35    43     40      33
Frase
  Retention       85    61     87    79     91      68
  Incidental      52    54     53    73     70      61

Note.—LBA = all questions and answers
were given, then Ss read passages (in the
present study the 40-line length most nearly
approaches this condition); B = questions
occurred before each paragraph; BA =
questions and answers given before each
paragraph; A = questions given after each
paragraph; AA = questions and answers
given after each paragraph; Control =
read prose passages. For the Rothkopf
groups, N = 20; for Frase, N = 18 for all
groups except LBA (N = 6) and Control
(N = 7).


condition were run on the first day and ½
on the second day. Two dependent meas- 
ures were obtained immediately after the 
reading task; (a) the number of correct 
responses to the 20 questions which had 
occurred in the learning task (retention 
questions), and (b) the number of correct 
responses to the 20 questions covering ma- 
terial not questioned in the reading task 
(incidental questions). The retention and 
incidental test scores were analyzed sepa- 
rately. They might have been included as 
a dimension in a repeated measures design, 
but the resulting confounding of some of 
the conditions with KR makes interpreta- 
tion ambiguous. 

A control group of seven Ss randomly 
assigned from the same class was run along 
with the other Ss to indicate a base-line
level of responding by simply reading 
through the prose material without the test 
questions. 

The reading material took the form of a 
stack of unbound pages. On one sheet of
paper S found a paragraph of prose material, 
on another sheet a question over that ma- 
terial, and on another sheet KR was given 
by repeating the number and content of the 
correct alternative along with the stem of 
the question. The sequence of paragraph- 
question-KR that a particular S saw de- 
pended upon the experimental group to 
which he was assigned. As an example, one
S might read the following six pages: (a) 
Paragraph 1, (b) Question 1, (c) Answer 1, 
(d) Paragraph 2, (e) Question 2, (f) Answer 
2. Another S might see the same material 
in a different sequence: (a) Question 1, (b) 
Answer 1, (c) Paragraph 1, etc., or, if the 
size of the material were varied: (a) Para-
graph 1, (b) Paragraph 2, (c) Question 1,
(d) Answer 1, (e) Question 2, (f) Answer 2. 
If KR was given, the answer always fol- 
lowed the question to which it related. 
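
The page orders just described can be generated mechanically. The sketch below is a hypothetical reconstruction: the function `build_pages` and its parameters are not from the article, and the ordering for questions placed before multi-paragraph blocks (all questions for the block, then the block's paragraphs) is an assumption, since the text gives no example of that case. A block of 1, 2, or 4 paragraphs corresponds to the 10-, 20-, and 40-line conditions.

```python
def build_pages(n_paragraphs, block, questions_before, with_kr):
    """Return the ordered list of page labels for one subject."""
    pages = []
    for start in range(1, n_paragraphs + 1, block):
        numbers = range(start, min(start + block, n_paragraphs + 1))
        prose = [f"Paragraph {i}" for i in numbers]
        quiz = []
        for i in numbers:
            quiz.append(f"Question {i}")
            if with_kr:
                # KR always directly follows the question it relates to.
                quiz.append(f"Answer {i}")
        pages += (quiz + prose) if questions_before else (prose + quiz)
    return pages


# Questions after each 10-line paragraph, with KR.
print(build_pages(2, 1, questions_before=False, with_kr=True))
# Questions after a 20-line block (two paragraphs), with KR.
print(build_pages(2, 2, questions_before=False, with_kr=True))
```

Both printed sequences match the six-page examples given in the text; the full materials would use 20 paragraphs.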

The Ss were directed to seats on one side 
of tables (in a laboratory classroom) upon 
which the reading materials had previously
been placed. They were cautioned not to
look through the materials until told to do
so. After Ss had been seated, the experi-
menter read standard instructions stating
that this was an experiment to see how 
much people can learn from reading mate- 
rial. The Ss were told to read each page of 
their material (turning each page face down 
after they read it) and not to turn back 
once they had read a page. They were told 
that when they encountered a question in 
the text they were to try and get the an- 
swer before going on to the next page. They 
were told that a test would be found at the 
end of the reading material, and that they 
were to complete the test immediately after 
the reading task. 

The Ss were also informed that some of 
the materials were shorter than others and 
hence some Ss might finish sooner than 
others. The procedure was again explained.
The experimenter insured that the instruc- 
tions were understood before proceeding 
with the experiment. 


RESULTS 


Comparison of Two Studies 


Table 1 presents data from groups 
in the present study which are com-
parable to groups used in Rothkopf's
study. 

Rothkopf used a 5,000 word pas-
sage from Rachel Carson’s The Sea
Around Us, and fill-in questions 
rather than multiple-choice items. 
Hence, the two studies differ in terms 
of size and content of material, ani 
response mode. In spite of these dif- 
ferences Table 1 reveals that the ne 
tive standing of the groups was qu! 
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TABLE 2 
Summary oF MAIN EFFECTS 
Retention questions Incidental questions 
M F df M F df 

Question position 16.10 1/60** nk 

Before 15.14 FROM Mcr (rtg 

After 17.00 14.28 
Length of passage 3.94 2/60* 

10 15.96 11.33 

20 16.92 12.96 

40 15.33 13.18 
KR 52.20 1/60** 

Present 17.78 12.28 

Absent 14.36 12.66 

* p < 02. 

** p < 001 


similar across the two studies. The 
rank-order correlation of the means of 
the two studies is significant at the 
.01 level, attesting to the generality 
of Rothkopf's findings. The difference 
in length of material and response 
mode accounts for the higher average 
scores in the present study. 

The reader is referred to the note 
under Table 1 for an explanation of 
the group labels. It can be seen that 
in both studies the group which re- 
ceived questions (without KR) before 
prose passages (Group B) performed 
somewhat below the Control Group 
mean on incidental questions. The 
difference between the Control and B 
Group means was not statistically 
significant, but the consistency across 
studies suggests that there may ac- 
tually be a depressing effect of fac- 
tual prequestioning upon incidental 
learning which the test items were 
not sensitive enough to detect. 


Retention Questions 


There was a significant main ef- 
fect of all three factors upon reten- 
tion questions (Table 2). Presenting 
questions before paragraphs had 
the least facilitating effect upon re- 


tention. There was a significant inter- 
action between the position of ques- 
tions and whether KR was available 
or not (F = 4.42, df = 1/60, p < 
.05). When KR was available, the 
means for the before-after position 
of questions were 17.33 and 18.22; 
when KR was not available, the 
means were 12.94 and 15.78, respec- 
tively. In short, the position of the 
retention questions did not make 
much difference if KR was provided. 
If KR was not provided, then ques- 
tions were more effective following 
the passages. This result was some- 
what surprising. It was anticipated 
that Ss would obtain KR (when it 
was not given with questions) pro- 
viding that the questions occurred 
before the prose paragraphs. Con- 
trary to the cybernetic hypothesis, 
Ss did not seem to internalize ques- 
tions and focus their reading behav- 
iors on question-relevant content. 
Analysis of the length of passages 
reveals that the 20-line passage was 
optimal for facilitating retention. Pre- 
sumably, if passages are too long, 
associations between questions and 
content are difficult to establish. If 
passages are too short, continuity 
among prose content is broken. This 
result is especially interesting because 
the effect of length upon incidental 
questions was quite different (see be- 
low). 
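
The relation between the four cell means quoted above and the main-effect means in Table 2 can be checked with a short calculation. The sketch below assumes equal cell sizes, so that each marginal mean is the unweighted average of two cells; the variable and function names are illustrative only.

```python
# Cell means for retention questions, as reported in the text.
cell_means = {
    ("KR", "before"): 17.33,
    ("KR", "after"): 18.22,
    ("no KR", "before"): 12.94,
    ("no KR", "after"): 15.78,
}


def marginal(level, axis):
    """Unweighted mean of the cells whose key has `level` at position `axis`."""
    values = [m for key, m in cell_means.items() if key[axis] == level]
    return sum(values) / len(values)


# Main-effect means for question position and for KR.
position_means = {p: marginal(p, 1) for p in ("before", "after")}
kr_means = {k: marginal(k, 0) for k in ("KR", "no KR")}

# Simple effect of question position (after minus before) at each KR level.
position_effect = {
    k: cell_means[(k, "after")] - cell_means[(k, "before")]
    for k in ("KR", "no KR")
}
```

Under that assumption the marginal means agree, to two decimals, with the Before, After, Present, and Absent means listed for retention questions in Table 2. The simple effects (about 0.89 with KR and 2.84 without) show the interaction directly: question position matters mainly when KR is absent.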

The result of including KR upon 
retention items confirms the expecta- 
tion that if students are given the 
answers to questions which will occur 
on an immediate posttest, their per- 
formance will be relatively high on 
that test. 


Incidental Questions 


Table 2 reveals one significant ef- 
fect for incidental questions. Placing 
retention questions after prose pas- 
sages tended to increase the amount 
of information acquired for other 
portions of the material. A similar ef- 
fect was obtained with the retention 
questions (contrary to Rothkopf’s 
nonsignificant findings in this re- 
gard), but the difference between 
the before-after position was even 
more pronounced for incidental ques- 
tions. The results clearly indicate 
that, if questions are to have a gen- 
eral facilitative effect upon retention 
of prose material, then the questions 
should occur after passages. 

The differences between levels of 
passage length were not significant for 
incidental questions. In contrast to 
the significant curvilinear relation- 
ship for retention questions across 
length of passage, there was a gradual 
improvement in scores on incidental 
questions with the larger passages. 
This trend, in accordance with Ausu- 
bel's (1962) view, implies that read- 
ing larger passages may be an opti- 
mal procedure when relevant guidance, 
in the form of short tests, does not 
occur in the prose material. Hence, 
support is given to both the small step 
and whole reading approaches depend- 
ing upon whether one emphasizes re- 
tention of selected facts or retention of 
the total prose passage. 


Discussion 


The results of the present study 
support the data of Rothkopf (1965, 
1966) which indicate that short tests 
included with prose passages facili- 
tate retention of specific and inci- 
dental information. Placing test ques- 
tions after the prose passages was 
the optimal procedure for both spe- 
cific and general retention. There are 
two relevant hypotheses which re- 
late to this result. The first hypothesis 
states that placing test questions 
after passages requires Ss to review 
implicitly content which has just 
been read; therefore retention of the 
preceding material is facilitated. 
This hypothesis asserts that questions 
work in a backward manner, or- 
ganizing and repeating previous 
prose content. Ausubel, Schpoont, and 
Cukier (1957) found that asking Ss 
to remember course material after 
they had read the material did not 
facilitate retention. It is difficult to 
see how Ss could make use of ques- 
tions in a review capacity unless 
they were able to reproduce the pre- 
ceding passage. 

The alternative hypothesis asserts 
that questions act in a forward man- 
ner optimizing “mathemagenic” be- 
haviors on passages following the 
questions. Questions which occur be- 
fore prose passages (and are relevant 
to those passages) tend to limit the 
general facilitating effects of test-like 
events since they relate to specific 
content within the passages. In effect, 
placing questions after passages (as 
review) alters their function. The 
questions now provide a general test- 
taking orientation applicable to the 
paragraphs which follow. The for- 
ward hypothesis accounts for several 
facts. First, Rothkopf (1966) ob- 
tained relatively high performance 
on incidental questions from a group 
which received only a general in- 
struction to read carefully. Such an 
instruction, although not frequent 
enough to maintain attentive re- 
sponses, should provide some facili- 
tation. Second, when questions were 
placed before paragraphs in the pres- 
ent study, retention of incidental in- 
formation was relatively low. Fac- 
tual prequestioning evidently limits 
the range of attentive behaviors. 
Finally, recall of retention questions 
(when they occurred before passages) 
was high relative to incidental ques- 
tions—a repetition effect—but re- 
tention was even higher when the 
questions followed paragraphs (a 
repetition and general mathemagenic 
effect). 

A critical issue concerning the for- 
ward and backward hypotheses is the 
degree to which retention of the in- 
formation in one paragraph is facili- 
tated by placing a relevant question 
before it as opposed to an irrele- 
vant question. In the present study 
and in Rothkopf's (1966), questions 
which occurred after a relevant para- 
graph also acted as irrelevant ques- 
tions preceding the next paragraph. 
Further experimentation, in which a 
paragraph is either preceded or fol- 
lowed by a question (but not both), 
might contrast the relative strength 
of the repetition-review and attentive 
effects of questions. 

According to the cybernetic hy- 
pothesis, questions must occur before 
prose passages to focus effectively 
reading behaviors. If prequestioning 
were a potent variable in terms of 
controlling inspection behaviors, then 
lack of KR (with prequestioning) 
should have resulted in poorer reten- 
tion of incidental prose material, 
which it did not. If focusing did oc- 
cur, then it was not the result of 
epistemic behaviors (Berlyne, 1966), 
that is, a search for KR, but the 
result of a more general orientation in- 
duced by the factual questions. Ex- 
periments should be conducted to de- 
termine the range of questions which 
produce general or specific facilitat- 
ing effects. In terms of the previous 
discussion, an alternative means of 
inducing general facilitative effects 
would be to manipulate the function 
of questions by changing their posi- 
tion. 

To summarize, the present study 
extended the findings of Rothkopf 
(1966). Data indicate that test ques- 
tions induce appropriate attentive re- 
sponses when placed after the prose 
passages. Optimal spacing of test 
items in the prose material may differ 
for retention of test-relevant as op- 
posed to incidental prose content. An 
important unresolved issue concerns 
the relative strength of forward and 
backward effects of questions. Back- 
ward effects relate to review and 
repetitive effects of relevant ques- 
tions which follow paragraphs. For- 
ward effects, apparently quite potent 
in the present study, refer to general 
orientation or “mathemagenic” re- 
sponses which occur when unrelated 
questions precede passages. Another 
forward effect, suggested by the cy- 
bernetic hypothesis, concerns specific 
orientations when related questions 
precede passages. The latter did not 
appear as a strong factor in this 
study. 


DEVELOPMENTAL CHANGES IN THE RECOGNITION 
OF BEGINNING SEGMENTS OF ENGLISH WORDS 


ROBERT F. STANNERS¹ AND DOLORES H. SOTO² 
Washburn University 


This experiment was designed to investigate the ability of English- 
speaking children to distinguish, by means of a visual-reading method, 
among highly frequent initial 3-letter word beginnings (consonant- 
consonant-vowel), infrequent word beginnings, and CCVs which never 
occur in English. A modified method of paired-comparisons was used 
to present the CCVs to 68 3rd graders, 70 6th graders, and 71 9th grad- 
ers. The discriminatory ability for all grades was significantly above 
chance, p < .0001. The 6th and 9th graders were not significantly dif- 
ferent from each other, but both were better than the 3rd graders, 
p < .001. It appeared that discriminatory ability for the materials was 
already well begun by the end of the 3rd grade, virtually maximal 
by the end of the 6th grade, and did not differ with respect to sex. 


Linguists Chomsky and Halle 
(1965) have suggested that from the 
speech around him a child learns a 
grammar which allows him to form 
the admissible phonological combina- 
tions of certain letters but not others. 
In English, for example, words do not 
begin with sb, ny, dz, or many 
other combinations of letters which 
may occur in other languages. 
Church (1961) reports that even by 
age 2 or 2½, children have learned 
the rules for combining consonants 
and vowels well enough to be able 
to detect alien combinations. Carroll 
(1960) is of the opinion that the 
overall frequency with which lan- 
guage items occur in the speech 
heard by the child has a positive 
relation to the developmental se- 
quence in which the items are learned. 
As Chomsky and Halle (1965) put 
it, the child hears brik (brick) but 
not blik or bnik. Since he hears words 
like black and blanket, he builds a 
grammar that contains the rule that 
blik is admissible, even though it is 
not meaningful; but, since he hears 


¹ Now at Oklahoma State University. 

² Undergraduate research participant un- 
der the National Science Foundation pro- 
gram. 

no words beginning with bn, he will 
reject bnik as a possible English word. 

Two studies (Brown & Hildum, 
1956; Messer³) have shown that cor- 
rect aural identification of verbal 
units is a function of their proba- 
bility of occurrence in English. 
Brown and Hildum (1956) com- 
pared the ability of English speakers 
to identify nonsense words having 
unlawful (that is, not following the 
phonological rules of English) initial 
clusters with their ability to identify 
lawful nonsense words and low-fre- 
quency English words. The three 
types of words were presented aurally 
and adult subjects (Ss) were asked to 
spell the utterance. It was found that 
more low-frequency English words 
were correctly identified than lawful 
nonsense words, and there were more 
correct identifications of the lawful 
nonsense words than of the items with 
unlawful initial clusters. Messer per- 
formed his study with 4-year-old chil- 
dren to determine whether they could 
distinguish phonologically lawful non- 


³ A brief report of this study appears on 
page 61 of the 1964-1965 Fifth Annual Re- 
port from the Center for Cognitive Studies, 
William James Center for the Behavioral 
Sciences, Harvard University, Cambridge, 
Massachusetts. 


TABLE 1 

CCVs WITH IDENTIFICATION NUMBERS, FREQUENCIES OF 
OCCURRENCE, AND SCALE SCORES FOR THE VARIOUS GRADES 

Identification          Frequency of 
number           CCV    occurrence      Third | Sixth | Ninth 
 1               WHI       125          2.349 | 3.863 | 3.196 
 2               STA        91          2.536 | 3.511 | 3.406 
 3               THI       234          2.545 | 3.549 | 3.266 
 4               CHA        62          2.215 | 3.647 | 3.315 
 5               PLA        70          2.570 | 4.028 | 3.553 
 6               PRO        53          2.137 | 4.037 | 3.703 
 7               THO        52          2.182 | 4.013 | 3.393 
 8               DWE         1          1.225 | 1.528 | 1.727 
 9               BLE         1          1.890 | 2.893 | 2.758 
10               SCU         1          1.866 | 2.374 | 2.495 
11               SNU         1          1.797 | 2.047 | 2.344 
12               SWO         1          1.685 | 2.422 | 2.597 
13               WRA         1          2.041 | 2.791 | 2.596 
14               W.          1          2.099 | 2.819 | 2.756 
15               SBA         0          1.200 | 1.586 | 1.933 
16               FWO         0          1.042 | 1.352 | 1.500 
17               MBO         0          0.862 | 1.483 | 1.475 
18               RWA         0          0.847 | 1.393 | 1.158 
19               NWE         0          0.916 | 1.262 | 1.226 
20               FPA         0          0.986 | 1.392 | 1.089 
21               DZO         0          1.000 | 1.000 | 1.000 


sense words from unlawful nonsense 
words. The material was presented 
aurally and the results showed that 
about 70% of the time the children 
were able to distinguish the lawful 
nonsense words from the unlawful 
ones. Gibson, Osser, and Pick (1963) 
presented pronounceable and un- 
pronounceable trigrams tachistoscop- 
ically to first- and third-grade chil- 
dren and found that units which fit 
the rules of English, in terms of 
pronounceability, were recognized 
More accurately than those which 
did not. In a previous study (Gibson, 
Pick, Osser, & Hammond, 1962, p. 
555) it was hypothesized that “read- 
ing consists of decoding graphic 
material to the phonemic patterns of 
spoken language which have already 
been mastered when reading is be- 
gun” and that the observation of 
grapheme-phoneme correspondences 
leads to skilled recognition. 

The present study was concerned 
with the ability of English-speaking 
children (7½-15½ years) to dis- 
tinguish highly frequent initial 3- 
letter word beginnings from those of 
low frequency, and from those which 
never occur in English. Samples of 
third-, sixth-, and ninth-grade chil- 
dren were used to determine if knowl- 
edge of English phonological rules, 
as indicated by a visual-reading 
method, is a developmental sequence 
beyond the age of 7½ or 8 years, 
and whether there are sex differences 
in the level of this knowledge. 


METHOD 


Subjects 


The Ss were 209 public school children 
from two elementary schools and one junior 
high school in Shawnee County, Kansas. 
Approximately one-half of the third- and 
sixth-grade samples came from the same 
area as did the ninth-grade sample. The 
other one-half of the third- and sixth-grade 
samples came from another part of the 
county. As nearly as could be determined, 
all members of the total sample were of the 
same general middle-class background. All 
Ss spoke English as a native language and 
all were Caucasian except for two Ss who 
were Negro. Specifically, there were 68 third- 
grade children (34 boys ranging in age from 
89-123 months, and 34 girls, 88-112 months 
of age); 70 sixth-grade children (41 boys, 
137-162 months of age, and 29 girls, 137-160 
months of age); and 71 ninth-grade children 
(29 boys, 173-189 months of age, and 42 
girls, 172-185 months of age). 


Materials 


Three lists, each containing seven con- 
sonant-consonant-vowel (CCV) combina- 
tions, were constructed. One list was com- 
posed of the seven most frequently occurring 
initial (i.e., occurring in positions 1-3 of a 
word) CCVs obtained from the tables pub- 
lished by Mayzner, Tresselt, and Wolin 
(1965). Another list of initial CCVs which 
occurred only once in the sample was ob- 
tained from the same source. The third list 
of CCVs which could not occur under the 
phonological rules of English was obtained 
from Whorf's formula (Brown, 1965, p. 268). 
Table 1 presents the 21 CCVs and their 
frequencies of occurrence in a collection of 
20,000 English words, 4-7 letters in length. 
Frequencies were calculated by summing 
over word lengths 4-7 for letter positions 
1-3. 
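
The frequency count described above (summing over word lengths 4-7 for letter positions 1-3) amounts to tallying three-letter word beginnings. The following is a minimal sketch under stated assumptions: the word list is a hypothetical toy sample, not the Mayzner, Tresselt, and Wolin tables, and the function names are illustrative.

```python
from collections import Counter


def initial_trigram_counts(words, min_len=4, max_len=7):
    """Tally 3-letter word beginnings over words of length min_len..max_len."""
    counts = Counter()
    for word in words:
        w = word.upper()
        if w.isalpha() and min_len <= len(w) <= max_len:
            counts[w[:3]] += 1
    return counts


def is_ccv(trigram, vowels="AEIOU"):
    """True for consonant-consonant-vowel; Y is treated as a consonant here."""
    return (trigram[0] not in vowels
            and trigram[1] not in vowels
            and trigram[2] in vowels)


# Hypothetical toy sample; "the" and "whichever" fall outside lengths 4-7.
sample = ["which", "while", "white", "stand", "start", "dwell", "the", "whichever"]
counts = initial_trigram_counts(sample)
ccv_counts = {t: n for t, n in counts.items() if is_ccv(t)}
```

Each entry in the list is counted once, so repeated words in a running-text sample would each add to the tally.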
Procedure 


The arrangement of the CCVs for presen- 
tation was accomplished by arbitrarily as- 
signing a number between 1 and 7 as an 
identification number to each of the highly 
frequent (HF) CCVs; the numbers 8-14 to 
the infrequent (LF) CCVs; and the num- 
bers 15-21 to the nonoccurring (N-O) CCVs. 
CCV 1 was paired with CCVs 8-12 and 
15-19; CCV 2 with CCVs 13, 14, 8-10, and 
20, 21, 15-17; CCV 3 with CCVs 11-14 and 
8, and 18-21 and 15. This system was con- 
tinued until each of the highly frequent 
CCVs had been paired 10 times with CCVs 
from the other two groups. The same system 
was used to pair the LF CCVs with the N-O 
CCVs. The system resulted in a total of 105 
pairings, which was exactly one-half the 
number which would have been required if 
the usual method of paired-comparisons had 
been used. The CCV pairs were arranged in 
random order with the restriction that each 
reappearing CCV was as far as possible from 
its last appearance in the list. Each CCV 
was presented 10 times, and each time it 
was paired with a different CCV; however, 
there were no within-group pairings. Right- 
left placement of “correct” selections (i.e., 
the CCVs most frequent in English) within 
the entire list of 105 pairings was controlled 
with one-half of the correct choices occur- 
ring on the left, and the other one-half on 
the right. Sequential patterning of correct 
right-left responses was arranged randomly. 
The CCV pairs were presented to each S in 
a small booklet of 105 pages with one pair- 
ing on each page. All booklets were identical 
except that one-half of them were arranged 
in reverse order to control for practice and 
fatigue. 
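
The pairing system can be reconstructed as a cyclic assignment. The sketch below is an inference from the three examples given for CCVs 1-3, not the authors' documented algorithm. In particular, the reading of "the same system" for the LF with N-O pairings (each LF CCV paired with five N-O CCVs) is an assumption, chosen because it yields the stated totals. All names are illustrative.

```python
from collections import Counter

HF = list(range(1, 8))    # highly frequent CCVs, identification numbers 1-7
LF = list(range(8, 15))   # infrequent CCVs, 8-14
NO = list(range(15, 22))  # nonoccurring CCVs, 15-21


def cyclic_partners(index, group, per_item=5):
    """Take per_item consecutive members of group, resuming where the
    previous item stopped and wrapping around at the end of the group."""
    start = index * per_item
    return [group[(start + k) % len(group)] for k in range(per_item)]


def build_pairings():
    pairs = []
    # Each HF CCV meets five LF and five N-O CCVs.
    for i, hf in enumerate(HF):
        for other in cyclic_partners(i, LF) + cyclic_partners(i, NO):
            pairs.append((hf, other))
    # Assumption: each LF CCV meets five N-O CCVs by the same cyclic rule.
    for j, lf in enumerate(LF):
        for other in cyclic_partners(j, NO):
            pairs.append((lf, other))
    return pairs


pairings = build_pairings()
appearances = Counter(ccv for pair in pairings for ccv in pair)
```

Under these assumptions the scheme gives 105 distinct pairings with every CCV appearing 10 times and no within-group pairings, which is half of the 210 pairings a complete paired-comparison design for 21 items would need.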

The Ss were run in groups in their re- 
spective classrooms during May, 1966. The 
session was untimed, with the administrator 
reading aloud the instructions and offering 
four visual examples of paired-comparisons 
which were similar to those in the materials. 
The Ss were asked to write their sex and 
birth date on the back of the booklet, and to 
underline the letter combination on each 
page which they had seen or heard most 
often as the beginning of an English word. 
All questions by Ss were answered before 
and during the administration. 


RESULTS AND DISCUSSION 
The first phase of the analysis was 
to test the mean proportion of judg- 
ments correct, that is, in conformity 
with the objective frequency count, 
against a chance hypothesis, a pro- 
portion of .500. For the third grade 
the mean was .795, t (67) = 16.52, 
p < .0001; for the sixth grade the 
mean was .908, t (69) = 44.27, p < 
.0001; for the ninth grade the mean 
was .923, t (70) = 58.12, p < .0001. 
Paired-comparison scores, presented 
in Table 1, were calculated for each of 
the CCVs by the incomplete matrix 
method (Torgerson, 1958, Ch. 9). A 
separate analysis was done for each 
sample of Ss from the three grades. A 
value of 1.00 was used as an arbitrary 
origin point for the scale. The means 
of each set of seven scores in a given 
frequency category are plotted by 
grades in Figure 1. A two-way analy- 
sis of variance was performed on these 
data, and all effects were found to be 
statistically highly significant. For 
grades, F (2/36) = 129.42, p < .001; 
for frequency, F (2/18) = 110.20, 
p < .001; and for the interaction of 
Grades X Frequency, F (4/36) = 
19.02, p < .001. Figure 1 indicates 
that the significant interaction term 
may be attributed largely to the 
difference in the slopes of the trend 
lines of the sixth and ninth grades 
on the one hand, and the third grade 
on the other. The steeper slopes of 
the trend lines for the sixth and ninth 
grades, as compared to the third, 
indicate a higher level of discrimina- 
tion among the CCVs in the judg- 
ment of familiarity. Because of the 
significant interaction effect and the 
relative closeness in the means of the 
sixth and ninth grades, as compared 
to the third, two-group comparisons 
were made among the means of the 
grades by Scheffé’s test (Edwards, 
1962). The results of the tests are 
presented in Table 2. The difference 
between the means of the sixth and 
the ninth grades in the HF category 
is a result of the sixth graders show- 
ing a slightly higher level of dif- 
ferentiation among the CCVs within 


FIG. 1. Average familiarity scale score for the three grades as a function of CCV frequency. (Axes: mean scale value by frequency category.) 


the HF category than did the ninth 
graders. The sixth grade’s proportion 
scores for the various pairings of the 
HF CCVs with CCVs from the other 
two categories were somewhat more 
variable than were those of the ninth 
grade. The greater variability indi- 
cated that the sixth graders per- 
ceived slightly larger differences in 
familiarity among the HF CCVs and 
is reflected in the small increase in 
the scale scores. All other significant 
differences were between the third 


TABLE 2 
VALUES OF F ASSOCIATED WITH MEAN 
COMPARISONS OF SCALE SCORES 
USING SCHEFFÉ’S TEST 

                     Frequency level 
Comparison         N-O  |   LF   |   HF 
3rd vs. 6th       15.13 | 42.27  | 237.04 
3rd vs. 9th       14.11 | 50.54  | 125.45 
6th vs. 9th        0.02 |  0.37  |  17.60 

Note.—F = 13.86 is needed for signifi- 
cance at the .01 level; F = 25.94 at the .001 
level. N-O = nonoccurring; LF = low fre- 
quency; HF = high frequency. 


and sixth grade and the third and 
ninth grade. The tests of the mean 
proportions, the analysis of variance, 
and the multiple comparisons tests 
indicate that a fairly high degree of 
discriminatory capacity is learned by 
the end of the third grade and that 
the level of learning is virtually 
asymptotic by the end of the sixth 
grade. 
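
The category means plotted in Figure 1 can be recomputed from the scale scores in Table 1. The sketch below is a reader's check, not part of the original analysis. The ninth-grade score for CCV 21 is partly illegible in the source table and is assumed here to be 1.000, the arbitrary scale origin.

```python
# Scale scores from Table 1, ordered by identification number 1-21.
# Assumption: the ninth-grade value for CCV 21 is taken as 1.000.
SCORES = {
    "third": [2.349, 2.536, 2.545, 2.215, 2.570, 2.137, 2.182,
              1.225, 1.890, 1.866, 1.797, 1.685, 2.041, 2.099,
              1.200, 1.042, 0.862, 0.847, 0.916, 0.986, 1.000],
    "sixth": [3.863, 3.511, 3.549, 3.647, 4.028, 4.037, 4.013,
              1.528, 2.893, 2.374, 2.047, 2.422, 2.791, 2.819,
              1.586, 1.352, 1.483, 1.393, 1.262, 1.392, 1.000],
    "ninth": [3.196, 3.406, 3.266, 3.315, 3.553, 3.703, 3.393,
              1.727, 2.758, 2.495, 2.344, 2.597, 2.596, 2.756,
              1.933, 1.500, 1.475, 1.158, 1.226, 1.089, 1.000],
}
CATEGORIES = {"HF": slice(0, 7), "LF": slice(7, 14), "N-O": slice(14, 21)}


def category_means(scores):
    """Mean of the seven scale scores in each frequency category."""
    return {cat: sum(scores[sl]) / 7 for cat, sl in CATEGORIES.items()}


means = {grade: category_means(s) for grade, s in SCORES.items()}

# HF mean minus N-O mean: a rough index of the steepness of each trend line.
spread = {grade: m["HF"] - m["N-O"] for grade, m in means.items()}
```

On these figures the spread is smallest for the third grade and larger for the sixth and ninth grades, in line with the steeper trend lines described above, and the sixth-grade HF mean exceeds the ninth-grade HF mean.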

The next analysis was concerned 
with differences in the mean propor- 
tion correct among the three grades 
by sex. A two-way analysis of vari- 
ance for unequal cell frequencies 
(Winer, 1962, Ch. 5) was performed 
on the proportion data. For grades, 
F (2/203) = 51.14, p < .001; for sex, 
F (1/203) = 0.00; and for the in- 
teraction of Grades x Sex, F (2/203) 
= 0.00. As would be inferred from 
the value of the latter two terms, 
the differences between the means for 
boys and girls over the three grades 
were extremely small. For girls and 
boys, respectively, in the third grade 
the means were .793 and .798; in 
the sixth grade, .912 and .906; and 
in the ninth grade, .923 and .923. The 
overall values for girls and boys 
were .876 and .876. The two-group 
differences among the means for the 
three grades were tested by Scheffé's 
test (Edwards, 1962). For the third 
grade versus the sixth, F (2/135) = 
44.02, p < .001; for the third grade 
versus the ninth, F (2/136) = 56.897, 
p < .001; and for the sixth grade 
versus the ninth, F (2/138) = .798. 
Virtually all of the large main effect 
for grades which was found in the 
analysis of variance may be attributed 
to the difference between the sixth and 
ninth grades on the one hand, and the 
third grade on the other. This 
finding may be adduced in support of 
the conclusion that the development 
of discriminatory capacity for initial 
CCVs is largely complete by the 
end of the sixth grade. The gen- 
erality of the conclusion is limited, of 
course, by the sample of materials 
used in the present study. The nearly 
identical performance for boys and 
girls is somewhat surprising in view 
of the common finding that girls are 
superior on verbal tasks (Anastasi, 
1958, Ch. 14). Additionally, boys 
have been found to have a greater 
incidence of reading difficulties in 
grade school (Bentzen, 1963). 
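
The reported means by grade and sex can be cross-checked against the sample sizes given in the Subjects section. The sketch below is a reader's check. It reads the sixth-grade girls' mean as .912, and the suggestion that the overall values for girls and boys are unweighted averages of the three grade means is an inference from the arithmetic, not something the text states.

```python
# (n, mean proportion correct) for each grade and sex.
CELLS = {
    ("third", "girls"): (34, 0.793), ("third", "boys"): (34, 0.798),
    ("sixth", "girls"): (29, 0.912), ("sixth", "boys"): (41, 0.906),
    ("ninth", "girls"): (42, 0.923), ("ninth", "boys"): (29, 0.923),
}


def unweighted_mean(sex):
    """Average of the three grade means for one sex, ignoring cell sizes."""
    values = [m for (_, s), (_, m) in CELLS.items() if s == sex]
    return sum(values) / len(values)


def weighted_mean(sex):
    """Average over all children of one sex, weighting each grade by its n."""
    cells = [(n, m) for (_, s), (n, m) in CELLS.items() if s == sex]
    return sum(n * m for n, m in cells) / sum(n for n, _ in cells)


def grade_mean(grade):
    """Mean for one grade, pooling boys and girls by their cell sizes."""
    cells = [(n, m) for (g, _), (n, m) in CELLS.items() if g == grade]
    return sum(n * m for n, m in cells) / sum(n for n, _ in cells)
```

The unweighted averages round to .876 for both sexes, whereas the weighted averages differ slightly because the grades contribute unequal numbers of boys and girls. The pooled grade means agree with the .795, .908, and .923 used in the tests against chance to within rounding.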

Since the familiarity scale scores 
for the CCVs could be useful in 
other research, the stability of the 
scores was checked by calculating 
the correlations among the three sets 
of scores from the three independent 
samples. For the third grade and 
sixth grade, r = .94; for the third 
and ninth, r = .97; and for the sixth 
and ninth, r = .97. The rank-ordering 
of the CCVs, even within a fre- 
quency category, was very similar 
across the three grades. 
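
The stability check can be sketched as product-moment correlations among the three columns of scale scores in Table 1. This is an illustrative recomputation that assumes the reported r values are Pearson correlations. It also assumes the partly illegible ninth-grade score for CCV 21 is 1.000, so the results should fall close to the reported values without necessarily matching them exactly.

```python
from itertools import combinations
from math import sqrt

# Scale scores from Table 1, ordered by identification number 1-21.
# Assumption: the ninth-grade value for CCV 21 is taken as 1.000.
SCORES = {
    "third": [2.349, 2.536, 2.545, 2.215, 2.570, 2.137, 2.182,
              1.225, 1.890, 1.866, 1.797, 1.685, 2.041, 2.099,
              1.200, 1.042, 0.862, 0.847, 0.916, 0.986, 1.000],
    "sixth": [3.863, 3.511, 3.549, 3.647, 4.028, 4.037, 4.013,
              1.528, 2.893, 2.374, 2.047, 2.422, 2.791, 2.819,
              1.586, 1.352, 1.483, 1.393, 1.262, 1.392, 1.000],
    "ninth": [3.196, 3.406, 3.266, 3.315, 3.553, 3.703, 3.393,
              1.727, 2.758, 2.495, 2.344, 2.597, 2.596, 2.756,
              1.933, 1.500, 1.475, 1.158, 1.226, 1.089, 1.000],
}


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# One correlation for each pair of grades.
correlations = {
    (g1, g2): pearson(SCORES[g1], SCORES[g2])
    for g1, g2 in combinations(SCORES, 2)
}
```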
An experiment was designed to assess 2 methods of facilitating the ac- 
quisition of paired associates. Each of 96 3rd- and 96 6th-grade 
children was asked to learn a list of 24 pairs by a study-test method. 
The pairs were either pictures of objects or the printed names of those 
objects. As each pair was presented, the experimenter uttered 1 of 4 
kinds of verbalization: (a) the names of the objects; (b) the names of 
the objects connected by a conjunction; (c) the names of the objects 
connected by a preposition; and, (d) the names of the objects con- 
nected by a verb. For both older and younger Ss, pictorial materials 
produced more efficient learning than printed materials and verb con- 
nectives uniformly facilitated acquisition. The relationship between 
connective form class and amount learned was such that verbs pro- 
duced more learning than prepositions and prepositions produced more 
than conjunctions. The form of this relationship varied with materials 
and with grades. 


Surely, one of the important tasks 
of pedagogy is to create conditions 
that produce efficient learning. Two 
possible ways of accomplishing this 
task, at least for the kinds of learn- 
ing that conform to the paired-as- 
sociate paradigm, are close at hand. 
The first derives from evidence re- 
ported in a series of studies of the 
effects of sentence contexts on the 
learning of noun pairs. The learning 
materials used have consisted of small 
objects (Jensen & Rohwer, 1963a), 
pictures of objects (Jensen & Rohwer, 
1963b, 1965) and printed names of 
objects (Rohwer, in press; Rohwer 
& Lynch, 1966, in press; Rohwer, 
Shuell, & Levin, in press). The typ- 
ical design of these studies contrasted 
the amount learned when the noun 


1 This work was supported, in part, by a 
contract with the United States Office of 
Education (OE6-10-273) through Project 
Literacy. The report was prepared at the 
Institute of Human Learning, which is sup- 
ported by grants from the National Science 
Foundation and the National Institutes of 
Health. 


pairs or object names were pre- 
sented in the context of a sentence 
uttered either by the experimenter 
(E) or by the subject (S) or both, 
with the amount learned when no 
sentence context was provided. Uni- 
formly, the sentence conditions pro- 
duced more efficient learning by 
mentally retarded adults (Jensen & 
Rohwer, 1963a, 1963b), second-, 
fourth- and sixth-grade school chil- 
dren (Jensen & Rohwer, 1965), and 
fifth- and sixth-grade children (Roh- 
wer & Lynch, 1966). Using a more 
finely-graded series of conditions, 
Rohwer (1966) found that the pres- 
entation of noun pairs in the con- 
text of an English string in which 
the nouns were connected by a verb 
or a preposition (e.g., The running 
COW chases the bouncing BALL; 
The running COW behind the bounc- 
ing BALL) produced more rapid 
learning than strings in which the 
pairs were connected by a conjunc- 
tion (The running COW and the 
bouncing BALL). The reported as- 
sociation between gradations in learn- 
ing efficiency and the form class of 


context connectives is in need of 
replication, but the general result is 
reliable: Sentence contexts facilitate 
the learning of noun pairs. 

A second method that appears to 
have potential for promoting learning 
efficiency is that of presenting ma- 
terials pictorially rather than in 
print. In connection with the question 
whether or not foreign-language 
words are learned more easily when 
associated with their environmental 
referents or when associated with 
native-language word equivalents, 
Wimer and Lambert (1959) found 
that nonsense syllables were learned 
faster as responses to objects than as 
responses to the printed names of 
the objects. Since frequently it is 
difficult to use many kinds of actual 
objects as learning materials, whether 
it be in the classroom or in the lab- 
oratory, it is important to know 
whether the demonstrated superiority 
of objects over words extends to pic- 
tures of objects as well. Furthermore, 
it remains to be shown whether or 
not differences in materials produce 
differences in learning efficiency when 
both the stimulus and the response 
members of paired associates are 
familiar. 

The purpose of the present experi- 
ment thus was twofold. First, to repli- 
cate the graded facilitation associated 
with connective form class found by 
Rohwer (1966) on both printed and 
pictorial materials and to deter- 
mine whether the increased learn- 
ing efficiency produced by sentence 
contexts can be further augmented by 
the use of pictorial materials. Finally, 
it was of interest to determine 
whether or not the predicted superi- 
ority of pictorial materials would 
decrease with age over an interval 
during which verbal facility pre- 
sumably increases. 


METHOD 


Materials and Design 


The principal factors in a 2 X 2 X 4 fac- 
torial design were: grades (third vs. sixth); 
materials (printed vs. pictorial); and ver- 
balization (conjunction vs. preposition vs. 
verb vs. control). Both the printed materials 
and the pictorial materials were presented 
by projection on a beaded screen. The pic- 
torial study-trial materials consisted of 16- 
mm, black and white movie film bearing the 
images of 24 pairs of objects that were pho- 
tographed against a neutral gray background 
and foreground for a total of 4 seconds each. 
The test-trial materials consisted of similar 
movie photographs of one object from each 
of the 24 pairs. The printed study-trial ma- 
terials were pairs of words (the names of the 
corresponding objects in the picture mate- 
rials) and the test-trial materials were the 
initial words from each pair photographed 
and mounted as 2 X 2-inch slide transpar- 
encies. A complete list of the 24 pairs of ob- 
jects/words is presented in Table 1. 
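
The 2 X 2 X 4 factorial layout can be enumerated as follows. This is a sketch for orientation only. The equal division of the 96 children per grade among the eight materials-by-verbalization cells is an assumption, since the text does not state the cell sizes.

```python
from itertools import product

GRADES = ("third", "sixth")
MATERIALS = ("printed", "pictorial")
VERBALIZATIONS = ("conjunction", "preposition", "verb", "control")

# Every combination of grade, materials, and verbalization condition.
cells = list(product(GRADES, MATERIALS, VERBALIZATIONS))

# Assumption: equal cell sizes within each grade (not stated in the text).
children_per_grade = 96
cells_per_grade = len(MATERIALS) * len(VERBALIZATIONS)
assumed_cell_size = children_per_grade // cells_per_grade
```

With 16 cells in all, equal allocation would place 12 children in each cell.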

All verbalizations were uttered by E as he 
read from a prepared script. Five-word sen- 
tences of the form article-noun-verb-article- 
noun were constructed for each of the 24 
pairs of objects/words and constituted the 
materials for the verb conditions. Compa- 
rable strings for the preposition and con- 
junction conditions were derived from the 
verb strings by substituting prepositions and 
conjunctions, respectively, for the verbs. Ver- 
balization for the control conditions simply 
consisted of E uttering the names of the two 
objects in each pair or reading the two words 
in each pair to S. A complete list of the 
scripts used by E is presented in Table 1. 


Subjects 


The sample consisted of 96 third- and 96 
sixth-grade children drawn from a school 
district in a middle-class area. Two grade- 
level populations were sampled in order to 
provide information as to the relative ef- 
fectiveness of picture and word materials 
for children of presumably lesser and greater 
verbal proficiency. A population younger 
than that in the third grade was avoided in 
view of the possibility that reading of the 
word materials might be problematical. The 
prediction was that the expected superiority 
TABLE 1 
STUDY-TRIAL MATERIALS 
Strings and objects/words Connectives 
1. The FORK the CAKE. or, on, cuts 
2. The TOWEL the PLATE. and, across, wipes 
3. The CAT the LOG. or, over, jumps 
4. The MAN the POLE. or, beside, bends 
5. The BAT the CUP. and, in, strikes 
6. The SHOE the CHAIR. and, beneath, taps 
7. The BOAT the BALL. and, against, rolls 
8. The HAND the HAT. or, inside, throws 
9. The ROCK the BOTTLE. or, behind, breaks 
10. The CAR the WAGON. and, under, upsets 
11. The ROPE the EYE. and, around, rubs 
12. The NEEDLE the BALLOON. or, outside, pops 
13. The DOG the GATE. and, on, closes 
14. The SPOON the EGG. and, under, rolls 
15. The FIRE the BED. and, behind, burns 
16. The AX the WOOD. and, upon, hits 
17. The KNIFE the FLOWER. or, below, cuts 
18. The BLANKET the TREE. or, around, covers 
19. The MILK the BOWL. and, inside, fills 
20. The TEETH the APPLE. or, near, bite 
21. The HAMMER the BELL. or, over, pulls 
22. The PENCIL the PAPER. or, across, tears 
23. The DOLL the BOOK. and, against, opens 
24. The FOOT the HOUSE. or, above, kicks 


of the pictorial over the printed materials
would be greater for the third- than for the
sixth-grade pupils. Within each grade, 12 Ss
were assigned randomly to each of the 8
experimental conditions.


Procedure


[Opening words illegible in source] complete
trials. Since two male adults served as Es,
the assignment of Ss was randomized and
balanced with respect to experimenters as
well as with respect to experimental
conditions. When an S entered the experimental
room, he was seated in front of the projector,
at a distance of approximately 10 feet from
the screen. The slide and the movie projectors
produced comparable levels of noise in the
experimental room and the same projection
screen was used with both.

The instructions informed S that he would be
shown 24 pairs of objects/words and that he
was to learn them in such a way that he could
produce the name of the second member of each
pair [when shown the] first. One example
[was given, after] which the first study trial
commenced. During the study trial, the pairs
were presented successively at a 4-second rate,
and, as each pair appeared on the screen, E
uttered the appropriate verbalization. After a
4-second intertrial interval, the test stimuli
were presented successively, again at a
4-second rate but in an order different from
that of the study trials, and, as each stimulus
appeared, S read or named it aloud. The same
procedure was followed on the second trial,
except that the items were presented in
different orders. Responses were made orally
[words illegible in source] and were recorded
by E.


RESULTS


The amount learned was measured in terms of
the total numbers of correct responses given
on the two test trials. Mean numbers of correct
responses are presented in Table 2 as a
function of grades, materials, and
verbalization. In the four-way analysis of
variance performed on these data, the main
effect for grades was not significant
(F = 1.41, df = 1/160, p > .05), nor was the
expected
TABLE 2 


MEAN NUMBERS OF CORRECT RESPONSES AS A FUNCTION OF GRADE, 
MATERIAL, AND VERBALIZATION 


                                        Verbalization
Material          Grade   Conjunction   Preposition   Verb    Control   Total
Printed             3        14.24         19.24      24.08    15.42    18.25
                    6        13.34         23.16      22.50    19.92    19.73
Printed total                13.80         21.20      23.30    17.66    18.99
Pictorial           3        23.76         28.58      33.16    29.08    28.64
                    6        30.08         25.66      33.76    27.76    29.31
Pictorial total              26.92         27.12      33.46    28.42    28.98
Total group                  20.36         24.16      28.38    23.04    23.98


Note.—MSE(160) = 19.59. 


interaction of Grades x Materials 
(F < 1). Clearly, however, learning 
was more efficient with pictorial than 
with printed materials (F = 122.24, 
df = 1/160, p < .01) in both grades, 
so much so that more than 32% of 
the total variance was associated 
with this factor. 
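
The absence of a Grades x Materials interaction can be seen directly in the cell means of Table 2. The sketch below is illustrative only: the dictionary layout and the names `cell_means`, `row_mean`, and `picture_advantage` are assumptions of this example rather than part of the original analysis, and only the numbers are taken from Table 2.

```python
# Cell means from Table 2, keyed by (material, grade). Each list is ordered
# conjunction, preposition, verb, control. All names are illustrative.
cell_means = {
    ("printed", 3): [14.24, 19.24, 24.08, 15.42],
    ("printed", 6): [13.34, 23.16, 22.50, 19.92],
    ("pictorial", 3): [23.76, 28.58, 33.16, 29.08],
    ("pictorial", 6): [30.08, 25.66, 33.76, 27.76],
}


def row_mean(material, grade):
    """Mean over the four verbalization conditions (equal n per cell)."""
    values = cell_means[(material, grade)]
    return sum(values) / len(values)


def picture_advantage(grade):
    """Pictorial mean minus printed mean for one grade."""
    return row_mean("pictorial", grade) - row_mean("printed", grade)


for grade in (3, 6):
    print(grade, round(picture_advantage(grade), 2))
```

The pictorial advantage comes to about 10.4 correct responses in the third grade and about 9.6 in the sixth, a nearly constant difference of the kind expected when the interaction F is less than 1.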

The question whether or not the 
connective form class effect reported 
by Rohwer (1966) is replicable re- 
ceives an affirmative answer in the 
present results. The main effect of 
verbalization was significant (F = 
13.63, df = 3/160, p < .01) and, of 
the three string conditions, only the 
Verb was superior to the control, 
confirming the results of previous ex- 
periments in which the presentation 
of paired nouns in sentence contexts 
increased learning efficiency. As for 
differences associated with connective 
form class, more was learned with 
verb connectives than with preposi- 
tions which, in turn, were superior 
to conjunctions.* The form of the 
relationship between connective form 
class and learning efficiency, how- 
ever, appears to depend upon whether 
the materials are words or pictures 
and upon the grade level of Ss. The 

* All post hoc comparisons were tested by 
means of the Scheffé method at p = .05. 


interaction of Materials x Verbaliza- 
tion was significant (F = 2.76, df 
= 3/160, p < .05), such that for 
printed noun pairs, the difference be- 
tween the verb and preposition con- 
nectives was not significant but each 
was superior to conjunctions. This 
outcome is consistent with the results 
obtained by Rohwer (1966) with 
printed materials. In contrast, verb 
connectives in the pictorial conditions 
were superior to both prepositions 
and conjunctions which did not differ 
significantly from one another. 

An examination of the three-way 
interaction, Grades x Materials x 
Verbalization (F = 3.45, df = 3/160, 
p < .05) revealed that the interaction 
Materials x Verbalization already 
described was located entirely in the 
sixth-grade samples (F = 6.21, df = 
3/160, p < .01) and not in the third- 
grade samples (F < 1). For the 
latter Ss the effects of connective 
form class were virtually parallel in 
the printed and pictorial conditions; 
in both cases, the difference between 
verbs and prepositions was as large 
as the difference between prepositions 
and conjunctions. 

The main effect of experimenters 
was not significant (F < 1) nor were 
any of the interactions involving this 
factor, with the exception of Experi- 
menters x Materials x Verbalization 
(F = 2.91, df = 3/160, p < .05). An 
appraisal of this interaction indicated 
that the direction of the differences 
obtained was consistent across Es 
but the magnitudes varied, that is, 
pictures were superior to words and 
verb connectives were associated with 
the greatest amount of learning for 
both Es. 

In sum, the presentation of paired 
associates in pictorial form and in the 
context of sentences produces more 
efficient learning than any other com- 
bination of conditions examined, and 
the form class of context connectives 
is consistently associated with the 
amount learned, although the de- 
tailed form of that relationship ap- 
pears to depend on the grade level 
of Ss and on the character of the 
learning materials. 


DISCUSSION 


With respect to the observed dif- 
ferences associated with type of 
verbalization, it is important to ex- 
amine an alternative interpretation 
to that offered here. A careful in- 
spection of the materials presented in 
Table 1 reveals that the three kinds 
of contextual strings differ in two 
ways rather than in only one. As 
previously noted, one of the differ- 
ences is in the form class of the con- 
text connectives; the second differ- 
ence lies in the number of unique 
words that served as connectives 
within each of the form classes. The 
sets of connectives in the Verb, Prep- 
osition, and Conjunction conditions 
were composed of 22, 16, and 2 
unique words, respectively. If it is 
assumed that a contextual string 
connective constitutes a part of the 
stimulus term in each pair, then the 
three form-class conditions differ 
markedly in intralist similarity. In- 
deed, the variation in intralist simi- 
larity, by itself, justifies the pre- 
diction that the amount learned 
should be greatest in the Verb con- 
dition, least in the Conjunction con- 
dition, and intermediate in the Prepo- 
sition condition. The interpretation 
of the form-class effect in terms of 
intralist similarity, however, has al- 
ready been falsified. In a previous 
experiment, Rohwer and Lynch (in 
press) systematically varied the num- 
ber of different words used in each 
form-class condition and found that 
the superiority of verb strings over 
conjunction strings was as great 
when only two different verbs were 
used as it was when eight different 
verbs were used. Thus it is warranted 
to conclude that the differences in 
performance observed in the present 
experiment are associated with dif- 
ferences in the form class of the 
context connective and not with dif- 
ferences in intralist similarity. 
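
The counts of unique connectives cited in this argument (22 verbs, 16 prepositions, 2 conjunctions) can be recovered from Table 1. The following is a minimal sketch; the list is transcribed from Table 1, while the names `connectives` and `unique_counts` are hypothetical and introduced only for illustration.

```python
# One (conjunction, preposition, verb) triple per pair, in Table 1 order.
connectives = [
    ("or", "on", "cuts"), ("and", "across", "wipes"),
    ("or", "over", "jumps"), ("or", "beside", "bends"),
    ("and", "in", "strikes"), ("and", "beneath", "taps"),
    ("and", "against", "rolls"), ("or", "inside", "throws"),
    ("or", "behind", "breaks"), ("and", "under", "upsets"),
    ("and", "around", "rubs"), ("or", "outside", "pops"),
    ("and", "on", "closes"), ("and", "under", "rolls"),
    ("and", "behind", "burns"), ("and", "upon", "hits"),
    ("or", "below", "cuts"), ("or", "around", "covers"),
    ("and", "inside", "fills"), ("or", "near", "bite"),
    ("or", "over", "pulls"), ("or", "across", "tears"),
    ("and", "against", "opens"), ("or", "above", "kicks"),
]


def unique_counts(rows):
    """Count distinct words within each form class."""
    conjunctions, prepositions, verbs = zip(*rows)
    return {
        "conjunction": len(set(conjunctions)),
        "preposition": len(set(prepositions)),
        "verb": len(set(verbs)),
    }


print(unique_counts(connectives))
```

Fewer distinct words in a form class means more pairs share the same connective, which is the sense in which the conjunction lists are higher in intralist similarity than the verb lists.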

The present demonstration of the 
superiority of pictorial over printed 
word materials for the promotion of 
efficient  paired-associates learning 
raises two kinds of additional ques- 
tions. The first concerns the scope, 
the possibility, and the manner of 
applying these results to the prob- 
lem of presenting materials for learn- 
ing in school settings. The most 
important restriction on the scope of 
application of the present results is 
that research to date warrants gen- 
eralization only to those kinds of 
school learning tasks that are iso- 
morphic with the paired-associates 
paradigm. Runquist and Hutt (1961), 
for example, report that high-school 
students learn verbal concepts more 
rapidly when the materials are repre- 
sented verbally than when they are 
represented pictorially. The question 
of the possibility of applying the 
present results to school learning 
requires an answer that is sensitive 
to practical as well as to scientific 
constraints. The most direct implica- 
tion of the demonstrable superiority 
of the pictorial mode is that relevant 
curricular materials presently availa- 
ble in printed form should be con- 
verted to a pictorial form. Although 
such a conversion would not be im- 
possible, considerable resistance might 
well be expected. Alternatively, it is 
of interest to consider the suggestion 
that learners themselves be trained to 
make covert pictorial responses to 
printed materials. Such a program 
would not only avoid the difficulties 
involved in reconstructing current cur- 
ricular materials, it would also better 
equip the learner to engage in efficient 
acquisition regardless of the character 
of the content he is asked to learn. A 
decision as to the feasibility of such a 
program awaits empirical evaluation. 
The second kind of question is 
directed at the issue of choosing an 
explanation for the differences ob- 
served. Data reported by Wimer and 
Lambert (1959) suggest that, for 
college-age Ss, the greater difficulty 
presented by the task of learning 
word-trigram than by that of learn- 
ing object-trigram pairs is due to the 
greater intralist similarity among the 
former and not to differences in 
meaningfulness. Nevertheless, for Ss 
as young as those who participated 
in the present study, word and pic- 
ture stimuli may differ in meaningful- 
ness as well as in intralist similarity. 
The facilitatory effect of presenting 
pairs in the context of sentences 
appears quite robust across differ- 
ences in populations and differences 
in materials. Again, however, the 
task of explanation remains to be 
accomplished, and a clarification of 
the facts as to the effects of connec- 
tive form class is relevant to this 
task. The striking differences ob- 
served between the third- and sixth- 
grade samples in the way contextual 
strings affected the learning of pic- 
torial and printed pairs invite 
speculation. In the third-grade sam- 
ples, the relationship between connec- 
tive form class and amount learned 
was markedly linear and consistent 
across materials, as if each of the 
kinds of learning aids provided 
added a constant increment to per- 
formance. For sixth-grade Ss, how- 
ever, the preposition connectives ap- 
peared to be functionally equivalent 
to the verbs in producing superior 
learning of the word pairs and pic- 
torial presentation, with only a con- 
junction string context, seemed suf- 
ficient to promote learning as efficient 
as that produced by preposition 
strings in the third-grade samples. 
Loosely speaking, it was as if the 
sixth-graders’ threshold for engaging 
in facilitatory processes was lower than 
that of the third-graders. If this is 
true, it suggests again the possibility 
that children might be trained not 
only to make covert pictorial re- 
sponses to printed materials but also 
to cast disparate learning elements 
in the mold of sentential structure. 
Regardless of the substantive worth 
of such speculations, they indicate 
the considerable amount of explica- 
tion that remains to be accomplished 
with regard to the present phenomena 
of efficient learning. 
DOGMATISM AND EXAMINATION PERFORMANCE 


B. JACK WHITE AND RICHARD D. ALTER 
University of Utah 


Whereas Ehrlich found significant negative r's between scores on the 
Dogmatism Scale and scores on the true-false sociology test, Christen- 
sen found no significant r's between dogmatism and scores on multiple- 
choice or essay tests in psychology. Since dogmatic Ss tend to give a 
disproportionate number of “true” responses on true-false tests, differ- 
ences in examination formats used in the 2 studies could account 
for the disparity in results; however, in the present study, a compari- 
son of correlations between dogmatism and number of correct re- 
sponses to true-false items (r = -.14) with the correlation for multiple- 
choice items (r = -.16) produced no support for this notion. Although 
the disparity might in part be due to differences in course content 
and approach to the subject matter in sociology and psychology, the 
variability in r's from 14 psychology classes in the present study in- 
dicates that sampling errors made a sizable contribution to the dis- 
parity. The weighted average r for the 14 classes was small (-.18; p 
< .01) and the magnitude of the r's was variable; nevertheless, the 
sign of the r's was relatively consistent (12 of 14 classes yielded nega- 


tive r’s; p = .006). 


At a common-sense level, one might 
expect “closed-mindedness” to inter- 
fere with performance on examina- 
tions in college courses, especially in 
courses where popular beliefs are 
likely to be contradicted. Ehrlich 
found support for this notion in two 
studies (1961a, 1961b) of the rela- 
tionship between dogmatism, as 
measured by Rokeach’s (1960) Dog- 
matism (D) Scale, and performance 
on examinations in introductory so- 
ciology. In the first study (Ehrlich, 
1961a), 100 students completed the 
D Scale, took the same 40-item true- 
false examination twice during the 
academic quarter, then received the 
examination again by mail 5 months 
after the end of the quarter. The cor- 
relations between D scores and ex- 
amination scores for the 57 students 
who returned the third test were 
—.30, —.52, and —.54 for the three 
respective tests. Ehrlich's second 
study (1961b) was conducted 5 years 
later as a follow-up. The D Scale and 
the examination were mailed to the 
90 subjects (Ss) in the original sam- 


ple who could be located and 65 of 
these were returned. The correlations 
of the resulting D scores with ex- 
amination scores obtained during the 
course 5 years earlier were —.32 
and —.29, respectively; the correla- 
tion between the most recent D 
scores and examination scores was 
-.43. 

Despite the fact that all of the cor- 
relations were significant and in the 
expected direction in both studies, 
some additional data in the second 
study raises a question about the gen- 
erality of the findings; namely, the 
correlation between D scores and 
overall grade averages (cumulative 
point-hour ratio) in college was only 
—.09. The fact that this correlation 
was so much smaller than the correla- 
tions between D scores and sociology 
examination scores led Ehrlich (1961b, 
p. 285) to conclude that "it seems 
quite likely that course content repre- 
sents the significant source of varia- 
tion." 

Christensen (1963) drew similar 
conclusions from his study of the re- 
lationship between dogmatism and 
grades in introductory psychology 
classes. The correlations between D 
scores and scores on a multiple-choice 
examination (r for 49 men = .04; r 
for 117 women = -.11) and be- 
tween D scores and scores on an essay 
examination (r for men = .16; r for 
women = -.08) were small and in- 
significant, findings which led Chris- 
tensen to speculate that perhaps the 
differences between the results of his 
study and Ehrlich's (1961a) were due 
to differences in course content and 
approach to the subject matter in 
psychology and sociology. More re- 
cently, Costin (1965) entertained the 
“course content" hypothesis as one 
possible explanation for the discrep- 
ancy between his results in intro- 
ductory psychology and  Ehrlich's 
(1961a). Costin found no significant 
relationship between dogmatism and 
multiple-choice examination scores, 
either on a precourse test (N = 67, 
r = -.23) or on a postcourse test 
(N = 67, r = -.19). To the writers’ 
knowledge, the only study which has 
reported a significant relationship be- 
tween dogmatism and grades in in- 
troductory psychology is one by 
Zagona and Zurcher (1965) wherein 
a correlation of -.20 (N = 517; p 
< .001) was obtained. 
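
Correlations of similar size are reported above as nonsignificant in one study and highly significant in another, and the difference lies largely in sample size. The sketch below, which is not drawn from any of the cited papers, applies the standard t test for a Pearson correlation; the function name `t_from_r` and the study labels are hypothetical, and the r and N values are those quoted in the text.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# (r, N) pairs as quoted in the text; labels are for illustration only.
studies = {
    "precourse test, N = 67": (-0.23, 67),
    "large sample, N = 517": (-0.20, 517),
}

for label, (r, n) in studies.items():
    print(label, "df =", n - 2, "t =", round(t_from_r(r, n), 2))
```

With 65 degrees of freedom the two-tailed .05 critical value of t is roughly 2.0, so a t near -1.9 falls just short of significance, whereas a t near -4.6 on 515 degrees of freedom is far beyond conventional critical values.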

One of the purposes of the present 
study was to investigate an alterna- 
tive to the course content hypothesis 
mentioned above. Since dogmatic in- 
dividuals evidence a response set to 
agree with items about which they 
feel uncertain (Peabody, 1961), there 
is a possibility that Ehrlich’s (1961a, 
1961b) correlations between dog- 
matism and sociology grades were 
heavily influenced by a tendency for 
dogmatic Ss to make a dispropor- 
tionately large number of “true” re- 
sponses on a true-false examination 
and, thus, receive poor scores on 
the examination, a tendency which 
would not be reflected in Christen- 
sen’s (1963) study since he used 
multiple-choice and essay examina- 
tions, or in Costin’s (1965) study be- 
cause he used multiple-choice items. 
With the foregoing in mind, assess- 
ment of the effects of examination 
format on the relationship between 
dogmatism and grades was under- 
taken in the present study. 

The second, and more important, 
purpose of the present study was to 
obtain correlations between dogma- 
tism and grades for a number of 
samples of Ss. The seven correlation 
coefficients reported above in connec- 
tion with Ehrlich’s studies (1961a, 
1961b) were obtained at various 
times from various members of one 
college course, the four reported for 
Christensen (1963) were obtained 
from one sample, the two reported 
for Costin (1965) were based on one 
sample, and the one reported for 
Zagona and Zurcher (1965) was, of 
course, based on one sample. Of the 
14 correlations which have been men- 
tioned, 12 were negative, a fact which 
hints that there is some consistency 
in the direction of the relationship, 
even if there is considerable variabil- 
ity in its magnitude. The present 
study provides additional informa- 
tion on the consistency of the direc- 
tion and magnitude of the relation- 
ship between dogmatism and grades 
in a number of psychology classes 
taught by the same instructor and in 
a number of classes taught by different 
instructors. 


METHOD 


During the period 1963-1965 the D Scale 
was administered to 2099 students in seven 
instructors’ introductory psychology classes 
at the University of Utah. Fourteen classes 
were involved (one instructor taught five 
classes, another taught four) which ranged 
in size from 33 to 319 students. Ten of the 
TABLE 1 
NUMBER OF SUBJECTS, MEAN D SCORES, AND CORRELATION BETWEEN 
D SCORES AND EXAMINATION SCORES 


              Number of subjects        Mean D score        r of D score & examination score
Instructor   Class    Men   Women    Class    Men    Women    Class      Men           Women
A               75     38     37     148.3   152.5   144.0    -.28*      .07           -.42**
A              288    120    168     149.3   151.6   147.7    -.16*     -.09           -.21**
A              319    171    148     153.1   155.2   150.7    -.13*     -.20*          -.07
A              290    143    147     149.4   153.9   145.0    -.32**    -.52**         -.23**
Aᵃ              46     25     21     144.3   143.0   145.9    -.20      -.32           -.02
B              246     96    150     146.8   150.0   144.8    -.20**    -.34**         -.09
B              247    152     95     149.8   150.3   147.8    -.24**    -.20*          -.26*
Bᵃ              52     29     23     148.8   147.8   150.0    -.09      -.04           -.14
B              187     82    105     153.1   159.0   148.6    -.17*     -.29**         -.12
C               78     37     41     147.0   152.7   141.9     .08       .13           -.11
Dᵃ              33     21     12     140.3   147.4   128.0    -.18      [illegible]    -.16
E               87     58     29     144.3   145.6   141.8     .07       .07            .08
Fᵃ              74     39     35     146.4   149.7   128.5    -.09      [illegible]    -.08
G               77     45     32     152.2   156.2   146.3    -.04      [illegible]     .20
Totals        2099   1056   1043     149.4   152.2   146.0    -.18**    -.21**         -.15**

ᵃ Night class. 
* p < .05. 
** p < .01. 


classes consisted of regular daytime stu- 
dents, most of whom were freshmen and 
sophomores. The four other classes were 
taught at night and were comprised of both 
working adults and daytime students in a 
ratio of approximately 2:1. 

One instructor (referred to later as In- 
structor “B”) used both true-false and mul- 
tiple-choice questions in his examinations 
whereas all other instructors used only mul- 
tiple-choice questions. The text and the 
examination questions which were used were 
the option of the instructor; however, all 
of the texts were of approximately equal 
caliber and most of the questions came from 
the instructor’s manual provided with the text. 

The D Scales were administered during 
a class period 3-5 weeks before the end of 
the quarter. At the end of the quarter the 
total number of questions correctly answered 
by each student on the various examina- 
tions was obtained and the resulting total 
Scores were correlated with D scores. 


RESULTS 
As can be seen in Table 1, the 
statistically significant correlations 
between D scores and examination 


scores tend to come from the larger 
classes, as would be expected. Al- 
though the correlations for the 14 
classes are generally small (weighted 
average r = —.18; p < .01), there 
is considerable consistency in the sign 
of the correlations. According to 
binomial tests, the fact that 12 of 
the 14 classes yielded negative cor- 
relations is significant (p = .006), 
as is the fact that women in 12 of 
the 14 classes yielded negative cor- 
relations (p = .006) and the fact that 
men in 11 of the classes did likewise 
(p = .029). 
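
The binomial probabilities just cited can be reproduced with a short calculation. This is a sketch under one assumption that the article does not state explicitly: that the tests were one-tailed, with negative and positive signs taken as equally likely under the null hypothesis. The function name `sign_test_p` is hypothetical.

```python
from math import comb


def sign_test_p(k, n):
    """One-tailed probability of k or more same-signed results out of n
    when each sign has probability .5."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


print(round(sign_test_p(12, 14), 3))  # 12 of 14 negative -> 0.006
print(round(sign_test_p(11, 14), 3))  # 11 of 14 negative -> 0.029
```

Both values agree with those reported, which suggests, though it does not prove, that one-tailed tests were used.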

The correlations in Table 1 show a 
good deal of variability in the re- 
lationships between D scores and ex- 
amination scores, both between 
classes taught by the same instructor 
(where course content and approach 
are likely to be more or less constant) 
and between classes taught by dif- 
ferent instructors (where, of course, 
both course content and approach are 
likely to vary). While the range of 
correlations for the 14 classes was not 
especially large (-.32 to .07), the 
range for men (-.52 to .13) and the 
range for women (-.42 to .20) were 
rather large. In a speculative vein, it 
is interesting to note that if the men 
and the women who produced the 
largest negative correlations in Table 
1 had happened to be in the same 
class, the weighted average correla- 
tion would have been —.50. Had 
similar circumstances obtained for 
the men and women who produced 
the largest positive correlations in 
Table 1, the weighted average cor- 
relation would have been .16. Under 
these hypothetical circumstances, the 
resulting correlations would have ap- 
proximated the range of the most 
disparate of the 11 correlations re- 
ported by Ehrlich (1961a) and Chris- 
tensen (1963); namely, —.54 (Ehr- 
lich) to .16 (Christensen). 
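
The two hypothetical averages in this paragraph can be recomputed from Table 1. The sketch below assumes a simple average weighted by subgroup size, which the article does not spell out; the function name `weighted_average_r` is hypothetical, and the r and N values are the extreme entries for men and women in Table 1.

```python
def weighted_average_r(pairs):
    """N-weighted mean of correlations; pairs is a list of (r, n)."""
    total_n = sum(n for _, n in pairs)
    return sum(r * n for r, n in pairs) / total_n


# Largest negative correlations: men (r = -.52, N = 143), women (r = -.42, N = 37).
most_negative = [(-0.52, 143), (-0.42, 37)]
# Largest positive correlations: men (r = .13, N = 37), women (r = .20, N = 32).
most_positive = [(0.13, 37), (0.20, 32)]

print(round(weighted_average_r(most_negative), 2))
print(round(weighted_average_r(most_positive), 2))
```

Under this assumption the results round to -.50 and .16, matching the figures given in the text.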


Dogmatism, Examination Format, 
and Examination Scores 


As indicated earlier, Instructor B 
in Table 1 used both true-false and 
multiple-choice questions on each of 
his examinations. During the aca- 
demic quarter, the 187 students in the 
fourth class listed for Instructor B 
in Table 1 took five examinations 
which contained a total of 347 ques- 
tions, of which 118 (34%) were true- 
false and 229 (66%) were multiple- 
choice. The correct answer for 44% 
of the true-false items was “true.” 

The correlation between D scores 
and the number of questions answered 
with “true” responses was .23 (p < 
.01). Despite this, performance on the 
true-false items was no worse than 
performance on the multiple-choice 
items—the correlation between D 
scores and number of correct re- 
sponses to the true-false items was 
-.14 and the correlation for the 
multiple-choice items was -.16. 
Thus, the fact that Ehrlich used true- 
false items does not appear to ac- 
count for the differences between his 
results and those of Christensen or 
Costin. 


DISCUSSION 


The results of the present study in- 
dicate that it is unlikely that the dif- 
ferences between Ehrlich’s (1961a, 
1961b) results and those of Christen- 
sen (1963) or Costin (1965) were 
due to differences in examination for- 
mat. While variations in course con- 
tent may account for some propor- 
tion of the differences among the 
studies just mentioned, the variabil- 
ity in the correlations obtained in the 
present study suggests that sampling 
errors also made a sizable contribu- 
tion to the differences. 

Despite the fact that the correla- 
tions between D scores and examina- 
tion scores were rather consistently 
negative in the present study, the 
weighted average correlation was 
small and the variability in the mag- 
nitude of the correlations was rather 
large. Thus, it seems fair to say that 
the predictive power of the D Scale 
with regard to grades is not impres- 
sive. 


REFERENCES 


Christensen, C. M. A note on “Dogmatism 
and Learning." Journal of Abnormal and 
Social Psychology, 1963, 66, 75-76. 

Costin, F. Dogmatism and learning: A fol- 
low-up of contradictory findings. Journal 
of Educational Research, 1965, 59, 186- 
188. 

Ehrlich, H. J. Dogmatism and learning. 
Journal of Abnormal and Social Psychol- 
ogy, 1961, 62, 148-149. (a) 

Ehrlich, H. J. Dogmatism and learning: A 
five-year follow-up. Psychological Reports, 
1961, 9, 283-286. (b) 

Peabody, D. Attitude content and agreement 
set in scales of Authoritarianism, Dogma- 
tism, Anti-Semitism, and Economic Con- 
servatism. Journal of Abnormal and So- 
cial Psychology, 1961, 63, 1-11. 

Rokeach, M. The open and closed mind. 
New York: Basic Books, 1960. 

Zagona, S. V., & Zurcher, L. A., Jr. The re- 
lationship of verbal ability and other 
cognitive variables to the open-closed 
cognitive dimension. Journal of Psychol- 
ogy, 1965, 60, 213-219. 

(Received October 11, 1966) 


Journal of Educational Psychology 
1967, Vol. 58, No. 5, 290-302 


A TWENTY-COLLEGE STUDY OF STUDENT x COLLEGE 
INTERACTION USING TAPE (TRANSACTIONAL 
ANALYSIS OF PERSONALITY AND ENVIRONMENT): 


RATIONALE, RELIABILITY, AND VALIDITY¹ 


LAWRENCE A. PERVIN 
Princeton University 


A study of college characteristics and Student X College interaction 
using Transactional Analysis of Personality and Environment 
(TAPE), an instrument based on the semantic differential. 3,016 
students from 21 colleges rated the following concepts on the 52 scales 
in Form A or the 52 scales in Form B: My College, My Self, Stu- 
dents, Faculty, Administration, Ideal College. Ratings of satisfaction 
with aspects of college life were also made on 16 scales. Data relevant 
to 4 areas are presented: (a) TAPE as a measuring device for intra- 
and interinstitutional research; (b) the relationship between concept 
discrepancy scores and satisfaction ratings; (c) reliability of TAPE; 
(d) factorial structure of TAPE. In general, discrepancies between 
student perceptions of themselves and their college were found to be 
related to dissatisfaction with college. The data were interpreted as 
supporting the theoretical model of Student X College interaction 
and the utility of TAPE in this area of research. This research should 
be useful in suggesting the transactions within the college, or between 
students and parts of the college, that might be influenced in the di- 


rection of fostering student development. 


Social science research has often 
attempted to account for behavior in 
terms of the individual or the en- 
vironment. This dichotomy is seen in 
controversies between sociology and 
psychology, social psychology and 
clinical psychology, S-R theory and 
psychoanalysis, and in the nature- 
nurture controversy (Hunt, 1965). 
The theoretical rationale for the re- 
search reported here and for the 
development of the Transactional 
Analysis of Personality and Environ- 
ment (TAPE) questionnaire is that 
human behavior can be best under- 
stood in terms of the interactions or 
transactions (Dewey & Bentley, 1949) 
between the individual and his en- 
vironment. This view has been most 


¹ This research was supported by grants 
from the Straus Council on Human Rela- 
tions [text missing in source] Office of Education. 


clearly expressed in the need-press 
personality theory of Murray (1938) 
though it has also received attention 
in, among others, the theoretical 
systems of Lewin (1951) and Heider 
(1958). 

The interaction approach has been 
found to be useful in areas outside 
the academic setting such as in- 
terpersonal attraction (Lott & Lott, 
1965; Miller, 1963; Newcomb, 1956), 
occupational choice and satisfaction 
(Super, 1963), adaptation to cultural 
patterns (Jahoda, 1961), and psy- 
chopathology (French, 1963; Kelly, 
1966; Spiegel, 1957). Within the aca- 
demic setting, performance has been 
related to an interaction between stu- 
dent personality and demands of the 
curriculum (Malleson, 1959; Snyder, 
1966), situational stress (Smith & 
Rockett, 1958), instructor personality 
(McKeachie, 1961), and type of exam 
(Claunch, 1964). Research on col- 
lege dropouts suggests the importance 
of a match or fit between student 
and college (Pervin & Rubin, in press; 
Stern, 1962; Summerskill, 1962). 

While instruments have been de- 
veloped to measure the college en- 
vironment (Astin, 1963; Pace, 1963; 
Stern, 1963), they do not provide for 
an analysis in terms of Individual X 
Environment interaction. The Activi- 
ties Index (AI) and College Char- 
acteristics Index (CCI) were de- 
veloped to follow Murray’s need- 
press system, but they have not been 
used in this way and it is not clear 
that the need and press scales on the 
instruments are comparable. Also, 
while the AI and CCI include items 
relevant to various parts of a col- 
lege environment (students, faculty, 
administration), analyses in terms of 
the interactions or transactions 
among these parts are not generally 
reported. 

As Hunt (1965) notes, the semantic
differential represents “an important 
method of assessing the interaction 
between people and situations [p. 
83].” The research reported here de- 
scribes the current status of TAPE, 
an instrument which uses the se- 
mantic differential technique to study 
the various interactions and transac- 
tions that occur within a college en- 
vironment, and their relevance to 
institutional strain and student satis- 
faction. In sum, this report discusses 
TAPE data in relation to the three 
kinds of analysis for which it was de- 
veloped: (a) comparisons of different 
college environments; (b) analysis 
of sources of conflict or strain within 
a college environment and comparisons
of these sources across colleges;
and (c) the analysis of individual
performance and satisfaction as a
function of Student X College interaction.
“The organism which adapts
well under one condition would not
survive under another. If for each
environment there is a best organism, 
for every organism there is a best 


environment [Cronbach, 1957, p. 
679].”

METHOD
Subjects 


Participants in this study were 3,016 
students from 21 colleges.2 For Form A
of TAPE, there were 1,393 subjects (Ss) 
from 11 public and 10 private colleges while 
for Form B there were 1,623 Ss from 11 
public and 9 private colleges. The colleges
in the sample varied in geographical loca- 
tion, size, and male-female ratio. Also, there 
was an attempt to select colleges with dif- 
ferent campus atmospheres. 

The selection of Ss differed at the various 
participating colleges. Generally the sample 
consisted of students taking an introductory 
psychology course. In one case volunteers 
were used and in a few there was a random 
sampling of students in each of the under- 
graduate years. In all cases the students 
were not paid and, except for two colleges, 
participation was anonymous. Data were 
collected in the spring of 1966. 


Materials and Procedure 


TAPE is based on the semantic dif- 
ferential technique and asks students to 
rate a number of concepts on the same 
polar adjective scales. In its standard form 
TAPE requires that the following concepts 
be judged on 52 scales: College, Self,
Students, Faculty, Administration, Ideal
College. Concepts relevant to the college refer
to the one the student is attending. In this
study, 11-point scales were used as opposed
to the more traditional 7-point scales. This
followed Gulliksen’s (1958) suggestion that 
Ss are capable of making finer discrimi- 
nations than are generally allowed for on 
the semantic differential. 

Forms A and B follow the same format 
but contain different scales. The scales 
used in these forms were developed on an 


2Participating colleges and universities
were: Antioch, Brooklyn, Bucknell, Cincin- 
nati, Dartmouth, Georgetown, 
Haverford, Kent, Maryland, Middlebury, 
New Mexico, North Carolina, Pennsylvania, 
Princeton, Smith, South Dakota, Stony 
Brook, Tennessee, Texas, Wesleyan. 
a priori basis as to which dimensions might
be important in assessing Individual X 
Environment interaction, particularly Stu- 
dent X College interaction. Modifications 
were then made on the basis of a 10-college 
exploratory study. Scales were chosen on 
the basis of whether they discriminated 
among colleges and between concepts; that 
is, whether a distribution of responses was 
obtained across subjects (colleges) or con- 
cepts. Examples of the polar adjective scales 
are: authoritarian-democratic, grinding-fun- 
loving, religious-secular, idealistic-material- 
istic, equalitarian-status-oriented. 

The TAPE questionnaire included a 
beginning page on which students gave 
information such as college class, sex, area 
of concentration and whether a resident or 
a commuter, and also included 16 questions 
relating to satisfaction with the college 
environment. The latter appeared in the 
middle of the questionnaire between the 
Students and Faculty concept ratings. 
Ratings were made on 11-point scales with 
the extremes being defined for Ss. About 
45 minutes was required for a student to 
fill in the background information, rate 
each of the six concepts on 52 scales, and 
complete the 16 satisfaction items. 


RESULTS 


The results are presented in rela- 
tion to the three purposes for which 
the instrument was developed: intra- 
institutional research, interinstitu- 
tional research and Student x Col- 
lege interaction. Data relevant to the 
first two goals are largely descrip- 
tive, whereas the data in relation to 
the third are more easily interpreted 
in terms of statistical tests of sig- 
nificance. Data relevant to the re- 
liability and factor structure of 
TAPE are also presented. 


Intra- and Interinstitutional Com- 
parisons: Descriptive Properties of 
TAPE 


Initial intrainstitutional analysis 
of TAPE data consists of looking at 
mean scale ratings on each concept 
and comparisons of means for a 
scale across concepts; that is, how do 
students at a college perceive them- 


selves and different parts of their col- 
lege environment? Interinstitutional 
comparisons include those of scale 
means for different colleges on single 
concepts and of the relative distri- 
bution of means for a single scale 
across different concepts; that is, do 
students at various colleges perceive 
different characteristics as associated 
with individual parts of the college 
environment (students, faculty, ad- 
ministration, etc.) and in the pattern 
of characteristics associated with 
these parts? Examples of these kinds 
of comparisons are given in Figure 1, 
where mean ratings on the concepts 
are plotted for three colleges on each 
of two scales. 

The data presented in Figure 1 il- 
lustrate a number of conclusions rel- 
evant to TAPE: (a) There is con- 
siderable variability in scale ratings 
across concepts for a single college. 
For example, on the first scale il- 
lustrated it can be seen that for Col- 
lege 1 the college and administration 
are rated as quite conservative, the 
self and students less so, and the 
ideal college as quite liberal. (b) 
There is considerable variability 
across colleges in scale ratings on 
each concept and, perhaps more im- 
portantly, in the pattern of ratings 
for a scale across the six concepts. For 
example, in the second scale in Fig- 
ure 1 the means for Schools 1 and 
2 are quite similar except for the 
student concept. The absolute values 
for Schools 2 and 3 are quite differ- 
ent but, with the exception of ideal 
college, the pattern of means is quite 
similar. On some scales almost all of 
the schools show a similar pattern 
whereas on other scales almost every 
possible pattern occurs. (c) The variability
in pattern of mean ratings
indicates that there can exist large
discrepancies between the way students
see various parts of the college.
Especially noteworthy is the fact
that the way students see themselves
can be at variance with the way they
see students at the college in general
(College 2, Scale 3B). It can be suggested
that large discrepancies suggest
sources of strain in the functioning
of the system. Different scales
and concepts are relevant to different
colleges in relation to the strain issue.
(d) The mean rating for college
can be more extreme than the mean
ratings for the parts of the college;
the whole can be greater than or different
from the sum of its parts
(College 2, Scale 6A).

Fig. 1. Concept means for three colleges on two TAPE scales. [Panels: Scale 6A, Conservative-Liberal, and Scale 3B; concepts plotted: College, Self, Students, Faculty, Administration, Ideal College.]

As an indication of the variability
in ratings across colleges, means for
the colleges were computed for each
scale on the six concepts. Then, for
each scale a range of means for the
colleges and the standard deviation
of the means was computed. The result
was a range of means and
standard deviations for the college
means for each scale on the six concepts.
The mean scale range and
scale standard deviation (across 52
scales) are presented for the six concepts
in Table 1. These data clearly
indicate that the greatest variability
occurs in relation to the college concept.
On the average, students at different
colleges tend to agree on the
way they see themselves and their
ideal college.

TABLE 1
CHARACTERISTICS OF SCALES FOR TAPE CONCEPTS

                        Form A                  Form B
Concept           Mean range   Mean SD    Mean range   Mean SD
College              4.4        1.47         4.4        1.11
Self                 2.4         .62         2.5         .64
Students             3.5         .86         3.7         .89
Faculty              2.5         .65         3.2          …
Administration       3.2         .80         3.3         .80
Ideal College        2.6          …          3.1          …
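
The computation summarized in Table 1 can be sketched as follows. This is a minimal illustration and not the author's procedure: the function name `scale_variability`, the toy data, and the use of the population standard deviation are assumptions, since the text does not say which form of the standard deviation was computed.

```python
from statistics import mean, pstdev


def scale_variability(college_means):
    """Summarize between-college variability for ONE concept.

    college_means[c][s] is the mean rating given by college c on scale s.
    Returns (mean range across scales, mean SD across scales).
    Assumption: population SD (pstdev); the source does not specify.
    """
    n_scales = len(college_means[0])
    ranges, sds = [], []
    for s in range(n_scales):
        # Collect the college means for this one scale.
        column = [row[s] for row in college_means]
        ranges.append(max(column) - min(column))
        sds.append(pstdev(column))
    return mean(ranges), mean(sds)


# Hypothetical toy data: 3 colleges, 2 scales (TAPE itself has 52 scales).
toy_means = [
    [1.0, 5.0],
    [3.0, 5.0],
    [5.0, 5.0],
]
mean_range, mean_sd = scale_variability(toy_means)
```

Running the same function once per concept would give one row of a table laid out like Table 1.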


Student X College Interaction: Va- 
lidity 

The major test of the validity of 
TAPE as a measure of Individual x 
Environment interaction consists of 
the relationship between concept dis- 
crepancy scores and satisfaction rat- 
ings. For each S a discrepancy score 
was calculated for each pair of con- 
cept ratings (N = 15). A discrepancy
score represented the sum of the 
absolute difference in ratings of two 
concepts on 52 scales. For each 
TAPE form correlations were com- 
puted between the concept discrepancy 
scores and the satisfaction ratings 
completed in the middle of the ques- 
tionnaire—a 15 X 16 correlation ma- 


trix. It was predicted that a high discrepancy
score would be related to
dissatisfaction and that this would 
hold more for nonacademic satisfac- 
tion than for academic satisfaction. 
Furthermore, it was predicted that
certain discrepancy scores should be
more significantly related to some
satisfaction variables than to others.
For example, Self-Students discrepancies
should most clearly relate to
reports of feeling uncomfortable with 
students, Self-Faculty discrepancies 
to reports of dissatisfaction with the 
faculty, and Self-Administration dis- 
crepancies to reports of disagreement 
with the administration. 
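
The discrepancy score described above (the sum of absolute rating differences between two concepts, taken over all scales, for each of the 15 concept pairs) can be sketched as follows. The function name `discrepancy_scores`, the dictionary layout, and the three-scale toy ratings are assumptions made for illustration; TAPE itself uses 52 scales per concept.

```python
from itertools import combinations

# The six TAPE concepts named in the text.
CONCEPTS = ["College", "Self", "Students", "Faculty",
            "Administration", "Ideal College"]


def discrepancy_scores(ratings):
    """Return one discrepancy score per concept pair for a single subject.

    ratings maps each concept name to a list of scale ratings, with the
    scales in the same order for every concept. Each score is the sum of
    absolute differences between the two concepts' ratings.
    """
    scores = {}
    for a, b in combinations(CONCEPTS, 2):  # 6 concepts -> 15 pairs
        scores[(a, b)] = sum(abs(x - y)
                             for x, y in zip(ratings[a], ratings[b]))
    return scores


# Hypothetical subject rated on only 3 scales (11-point ratings).
toy_ratings = {concept: [6, 6, 6] for concept in CONCEPTS}
toy_ratings["College"] = [1, 6, 11]
toy_ratings["Self"] = [3, 6, 7]

toy_scores = discrepancy_scores(toy_ratings)
self_college = toy_scores[("College", "Self")]  # |1-3| + |6-6| + |11-7|
```

Each of the 15 scores can then be correlated with each of the 16 satisfaction items across subjects, which yields the 15 X 16 matrix mentioned in the text.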

Space does not permit the presenta- 
tion of all correlation matrices for all 
schools. Therefore, data most relevant
to the validity issue and the
above predictions will be presented. 
Summary data are presented in Table 
2 which indicate the characteristics 
(mean, median, range, number sig- 
nificant) of the correlations between 
Self-College discrepancy scores and 
16 satisfaction variables. These are 
presented for public and private colleges
on Forms A and B. The correlations
are presented so that a
positive correlation always means a
positive relationship between discrepancy
scores and dissatisfaction ratings.
Correlations significant beyond
the .01 and .001 levels are included in 
the number significant (p < .05) 
category. The maximum number of 
significant correlations would be the 
same as the number of colleges in the 
sample. 

The data in Table 2 give clear support
to the predictions. While the
range of correlations indicates some
variability in the stability of the relationships
across schools and items,
the general trend is clearly in the
direction of a relationship between
high Self-College discrepancy and dissatisfaction.
The relationships are
relatively stable across the two forms,
though there is some tendency for
the public college correlations to be
higher on Form B than on Form A.
Since dropping out of college is a
much more complex phenomenon
than satisfaction, it is not surprising
that correlations for the former are
generally lower than those for the
latter. Though there are some exceptions,
the correlations relating to academic
dropout and satisfaction are
generally lower than those for nonacademic
dropout and satisfaction.
While not presented here, the correlations
between College-Ideal College
discrepancy scores and satisfaction
ratings are very similar to those
for Self-College discrepancy scores
and satisfaction, though generally
they are slightly higher. It was predicted
that certain discrepancy scores
should correlate better with some
satisfaction variables than with
others and data relevant to this prediction
are presented in Table 3. Satisfaction
Questions 5, 6, and 7 related
to satisfaction with students, faculty,
and administration respectively. We
would expect Self-Students, Self-Faculty,
and Self-Administration discrepancies
to correlate best with the corresponding
satisfaction variables. The data in
Table 3 indicate that while the discrepancy
scores generally relate to a
variety of satisfaction items, in every
case the discrepancy score correlates
highest with the corresponding satisfaction
variable. This holds for
public and private colleges on both
forms.

TABLE 3
MEAN AND MEDIAN CORRELATIONS BETWEEN THREE DISCREPANCY SCORES AND THREE
SATISFACTION VARIABLES FOR PUBLIC AND PRIVATE COLLEGES

                                    TAPE Form A                 TAPE Form B
Discrepancy                      Public       Private        Public       Private
score               Question    M    Mdn     M    Mdn       M    Mdn     M    Mdn
Self-Student           5       .30   .32    .37   .39      .32   .29    .39   .40
                       6       .18   .21    .18    …       .25   .28    .29   .30
                       7       .19   .17    .19   .21      .27   .23    .31   .38
Self-Faculty           5        …    .09    .17   .19      .19   .16    .23   .24
                       6       .23   .30    .35   .37      .33   .38     …    .47
                       7       .18   .26    .27   .27      .25   .24    .32   .34
Self-Administration    5       .08   .06    .19   .18       …    .16    .20   .26
                       6       .08   .11    .22   .24      .30   .35    .30   .28
                       7       .31   .38    .44   .42       …    .36    .49   .49

Note.—For TAPE Form A, N = 11 public, 10 private; for TAPE Form B, N = 11 public,
9 private.


Reliability

The test-retest reliability of the
semantic differential has been found
to be quite high (Miron, 1961; Osgood,
Suci, & Tannenbaum, 1957, pp.
126-143). Thus, in part the reliability
of TAPE stands upon past research
on the semantic differential.
[...] reliability of TAPE as a measuring
instrument. A reliability study was
completed on Form B by James
Pedersen at South Dakota State Uni- 
versity. In the spring semester stu- 
dents completed TAPE Form B and 
then a sample of these completed rat- 
ings on three of the concepts about a 
month later. All Ss rated the Self 
concept while one-half of the sample 
also rated the College concept and the 
other half of the sample the Students 
concept. 

Data relevant to the reliability of 
TAPE Form B are presented in Table 
4. First the reliability of individuals 
can be looked at. The first column 
indicates the mean reliability of rat- 
ings for individual Ss on each of the 
three concepts. These were computed 
across 52 scales for each S rating a 
concept. The data are quite compa- 
rable to those reported by Lilly 
(1965), which represents the one 
other place this kind of reliability is 
reported for the semantic differential. 
There was considerable variation 
among individuals in their reliability 
correlations. Another reliability check
for individuals is that on discrepancy
scores. If the individual had a large
discrepancy score between a pair of
concepts on one occasion, did he also
have a large discrepancy score upon
the second occasion? Product-moment
correlation coefficients were run
for Self-College and Self-Student
discrepancy scores, obtained for the
same Ss from the two testing sessions.
The correlation for the former was
.87 and that for the latter .95. The
conclusion can be drawn that the
discrepancy scores show a quite high
degree of reliability. Even if an individual
tends to vary in his scale
ratings, in general he tends to see
about the same amount of discrep- 
ancy between two concepts when the 
time between the ratings is not very 
great. 

TABLE 4
PRODUCT-MOMENT CORRELATION COEFFICIENTS INDICATING
RELIABILITY OF TAPE FORM B

                                                   Concept
Type of reliability                         College   Self   Students
Mean of individual subject reliabilities      .59     .70      .58
Mean of 52 scale reliabilities                .40     .56      .47
Concept reliability-across scales + Ss        .58     .70      .60
Scale reliability-means for two samples       .98     .98      .98
Scale reliability-means for test-retest       .95     .99      .95

Note.—For College, N = 37; for Self, N = 75; for Students, N = 35.

Turning to the scale ratings, the
second column in Table 4 indicates
the correlation between two sets of 
ratings on the scales. Here the ratings 
for each scale are correlated across 
Ss, resulting in 52 correlation coeffi- 
cients for each concept. The correla- 
tions here are not high but are sim- 
ilar to those reported by Norman 
(1959). Actually, these correlations 
do not represent a fair appraisal of 
the scale reliability. For example, the 
first scale on the college concept had 
a product-moment correlation coeffi- 
cient of .14—quite low. Yet, 50% of 
the ratings were either the same or one 
position off and 75% of the ratings 
were two or fewer positions off. The 
problem is that if a scale has low 
variability in ratings, often a de- 
sirable characteristic, and some mi- 
nor amount of changes in ratings, 
the reliability estimate will tend to 
be low. 
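
The point about restricted variability can be illustrated with a small worked example. The data below are invented for illustration and are not TAPE ratings; the helper names `pearson_r` and `within_k` are likewise assumptions. Every retest rating is the same as, or one position away from, the first rating, yet the product-moment correlation is zero because the ratings vary so little.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def within_k(first, second, k=1):
    """Proportion of rating pairs that differ by k positions or fewer."""
    return sum(abs(a - b) <= k for a, b in zip(first, second)) / len(first)


# Invented ratings on one 11-point scale for 8 subjects, two occasions.
first = [6, 6, 6, 6, 7, 7, 7, 7]
retest = [6, 7, 6, 7, 6, 7, 6, 7]

r = pearson_r(first, retest)          # 0.0: no linear association
agreement = within_k(first, retest)   # 1.0: all pairs within one position
```

The two indices answer different questions: the correlation asks whether subjects keep their rank order, while the agreement proportion asks whether the ratings themselves moved.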

Another kind of reliability is that
of concept reliability. As a measure
of this the ratings for the two occasions
were correlated across scales
and Ss. These correlations range from
.58 to .70. As a more global measure
of the reliability of the ratings, the
ratings from the two occasions were
correlated across concepts, scales, and
Ss (N = 7,644). The result was a
product-moment correlation coefficient
of .65.

TABLE 5
SCALE FACTORS AND SAMPLE SCALES
DERIVED FROM THREE-MODE
FACTOR ANALYSIS

Factor                              Sample scales
1. Impulsivity-Inhibition           sober-intoxicated
                                    disciplined-undisciplined
2. Humane idealism-Narcissism       humane-self-interested
                                    idealistic-materialistic
3. Warm-Cold                        warm-cold
                                    sociable-unsociable
4. Introversion-Extroversion        introverted-extroverted
                                    eggheadish-well-rounded
5. Goal-directed activity           motivated-undirected
                                    industrious-tranquil
6. Liberal idealism-Conservative    social welfare-laissez faire
   pragmatism                       socialistic-capitalistic
                                    idealistic-materialistic
7. Scholarship                      research-application
                                    scholarly-nonscholarly
8. …alienation                      relaxed-tense
                                    optimistic-pessimistic
9. Conventionality                  religious-secular
                                    moral-amoral
10. Creativity                      artistic-…
                                    esthetic-task-oriented
11. Sensitivity                     feminine-masculine
                                    sensitive-insensitive
12. Tradition                       upperclass-middle class
                                    elegant-common
                                    traditional-traditionless
13. Cosmopolitan-Provincial         cosmopolitan-provincial
                                    urban-rural

Finally, there is the reliability of 
the scale means. The scale means for 
a large sample were correlated with 
those of a smaller sample for each of 
the six concepts. The data for three 
concepts are given in the fourth col- 
umn of Table 4. The correlation co- 
efficients for the other three concepts 
and for the satisfaction items were 
either .98 or .99, indicating a high 
degree of reliability. Correlations 
between the scale means for the first 
time TAPE was administered and the 
scale means for the second testing 
session (test-retest reliability) are in- 


dicated in the last column in Table 4. 
These data are comparable to those 
reported by Jenkins, Russell, and 
Suci (1958) and clearly indicate that 
the scale means are quite reliable be- 
tween samples and testing sessions. 


Factor Structure 


A three-mode factor analysis was 
completed on the TAPE data across 
both forms. The procedure followed 
was the same as that in Levin's 
(1965) analysis, which was based on 
work by Tucker (1964). Whereas 
the usual semantic differential analy- 
sis examines only the scale-mode fac- 
tors, the three-mode factor analysis 
permits the investigator to explore 
the factors in the scale, concept, and 
subject (college) modes simultane- 
ously as well as the interrelations 
among these three sets of factors. 

In this study the factor analysis 
was completed across 104 scales, 6 
concepts, and 20 colleges. College 
means were used to represent the subject
mode. Fourteen scale factors,
three concept factors, and three col- 
lege factors were derived. All con- 
cepts were important in the concept 
factors and the college factors sug- 
gested one for state colleges and two 
for private colleges. The latter 
seemed to consist of a group of elite 
conservative colleges and a group of
elite liberal and less conventional col- 
leges. The addition of other colleges 
in the future will likely increase the 
number of college factors. 

The scale factors are given in Table
5 along with two examples of scales
with high loadings on the factors. The
factors clearly cover a variety of
areas including those of modes of impulse
expression, interests or goals,
and value orientations. A number of
factors appear to be similar to those
reported by Pace (1963) in the development
of CUES.

3The three-mode factor analysis was
completed by Roy S. Lilly.


Discussion 


The data in this study have been 
presented in relation to the specific 
goals of the research. However, the 
data in relation to each goal have 
relevance for one another, for the 
theoretical rationale involved, and 
for the TAPE instrument as a whole. 
TAPE appears to hold considerable 
promise for intra- and interinstitu- 
tional research. The validity data 
suggest that discrepancy scores can 
be useful in institutional research. 
For example, mean discrepancy scores 
can be computed for each pair of 
concepts. Comparisons of these would 
suggest sources of strain in the func- 
tioning of the parts of the college or 
differences in the system functioning 
of different colleges. In fact, large 
mean discrepancy scores for col- 
leges have tended to be related to 
large mean dissatisfaction scores, a
relationship already noted in relation 
to individuals. 

The discrepancy score-satisfaction 
data can also be used to assess the 
scales or dimensions upon which the
greatest discrepancies occur and those
which are most related to dissatisfaction.
An analysis of two schools
along these lines suggests that significant
areas of discrepancy and
dissatisfaction vary from college to
college and the relationship between
perceptions may vary within the
same area. For example, dissatisfied
students at one school saw the
college as more conservative, less
equalitarian, and less scholarly, and
the self as more liberal, more equalitarian,
and more scholarly than did
satisfied students. On each of these
scales the relationship was reversed
for dissatisfied students at the second
school; that is, at the second
school dissatisfied students saw the 
college as more liberal, more equali- 
tarian, and more scholarly, and the 
self as less liberal, less equalitarian, 
and far less scholarly than did satis- 
fied students. Indeed, data such as 
these strongly support the conclusion 
of Douvan and Kaye (1962) that 
there is something wrong in the proc- 
ess by which students select colleges 
and that the time may come when we 
are able to arrive at a student-college 
fit which is most conducive to de- 
velopmental growth and change. 
These data do not suggest that homo- 
geneity of colleges or homogeneity of 
students within a college is best. 
Rather they suggest that there is an 
optimum fit between student and col- 
lege, the qualities of which will vary 
for different students and different 
colleges. Viewed in this light, this 
research should be useful in suggest- 
ing the transactions within the col- 
lege, or between students and parts of 
the college, that might be influenced 
in the direction of fostering student 
development. 

Cronbach (1958) has suggested 
that discrepancy scores may be less 
useful than using ratings of one or 
another concept alone. An intensive 
analysis of the data from one school 
suggested that this was not the case 
for the relationships reported here. 
Discrepancy scores appeared to ac- 
count for more of the variance than 
either self or college ratings. Fur- 
thermore, since the relationship be- 
tween particular scale ratings and 
satisfaction scores can vary from col- 
lege to college, relationships involv- 
ing discrepancy scores will likely 
have greater stability across schools. 

Another possible source of variance 
in the data investigated was that of 
a curvilinear relationship between 
size of discrepancy score and degree 
of satisfaction. The data for a number
of the correlations were plotted
but evidence for such a curvilinear 
relationship was not found. 

It is possible to argue that the 
satisfaction ratings were contami- 
nated by the ratings of the concepts. 
For this to be true, one would have 
to argue that ratings on the first three 
concepts biased the satisfaction rat- 
ings and the latter in turn biased rat- 
ings on the next three concepts. If 
this were the case, one would sus- 
pect a general bias in ratings across 
satisfaction items. Yet, discrepancy 
score-satisfaction relationships var- 
ied between academic and nonaca- 
demic kinds of satisfaction and de- 
pended upon the concept pair and 
satisfaction item involved. Specific 
discrepancy scores (Self-Students,
Self-Faculty, Self-Administration)
tended to have their highest correla- 
tions with the corresponding satisfac- 
tion items. Finally, in an earlier 
study (Pervin & Rubin, in press) 
similar relationships were found even 
though concept and satisfaction rat- 
ings were made a week apart. 

Results from the three-mode fac- 
tor analysis should be useful in fu- 
ture research with TAPE. Data can 
now be analyzed in terms of how 
each of the concepts for each college 
loads on the scale factors. This pro- 
vides intra- and  interinstitutional 
comparisons using factor scores 
rather than scale scores. Further- 
more, the development of factors now 
allows for an analysis of the relation- 
ship between semantic space dis- 
crepancy scores (Osgood et al., 1957) 
and satisfaction ratings. Also, the 
suggestion of Cronbach (1958) that 
the factor scores may vary in their 
relationships to the dependent vari- 
able can be investigated. 

Future research with TAPE can 
follow along a number of lines. Four 


areas can be specified: (a) Intrainstitutional
variables affecting correlations.
Here one may study differences
between males and females,
members of different colleges, or
members of different college years.
Some early analyses suggest that the
relationships hypothesized hold best
for the freshman year. (b) Interinstitutional
variables affecting correlations.
Here one may study the
characteristics (size, complexity, etc.) 
of different colleges which affect the 
nature of the relationships. It has 
already been suggested that some 
scales and factors may be more im- 
portant for some institutions than for 
others. Also, some colleges may show 
greater tolerance for diversity and 
heterogeneity. (c) Personality vari- 
ables. It may be that some individuals 
are more tolerant of differences and 
are more flexible in adapting to them 
than others. (d) Instrument variables. 
Analysis of individual scale and fac- 
tor scores has already been suggested. 
Current research is also being directed 
toward an analysis of the direction of 
perceived discrepancies rather than 
just analyzing distance. A hypothesis 
being investigated here is that discrep- 
ancies which are perceived as helping 
the individual become like his ideal 
self are desirable whereas those which 
are perceived as taking him away 
from the ideal self are not. Other 
studies involving TAPE include the 
comparison of faculty ratings with 
student ratings and the analysis of 
changes in student perceptions over 
time. 

In contrast to other instruments,
TAPE allows for the analysis of
transactions among parts of the college
system and uses student perceptions
of areas of interest as opposed
to defining the characteristics relevant
to the area for the student (Astin,
1965). The data presented are
taken as supporting the theoretical
rationale which led to the development
of TAPE and the utility of
TAPE in the study of Student X College
interaction.

DIFFERENTIAL MEMORY FOR PICTURE AND WORD STIMULI


JOSEPH R. JENKINS, DANIEL C. NEALE, 
University of Minnesota 
AND STANLEY L. DENO 
University of Delaware 


To compare the effects of using either pictures or words in a recogni- 
tion task, 120 college sophomores were assigned randomly to 4 treat- 
ment conditions: (a) see pictures—recognize pictures (PP); (b) see 
words—recognize words (WW); (c) see pictures—recognize words 
(PW); (d) see words—recognize pictures (WP). When the number of 
correct identifications of the original stimuli were compared, Group 
PP was superior to WW (p < .01), and PW was virtually the same as
WW but significantly superior to WP (p < .025). The tendency to admit
intrusions, identifying new stimuli as members of the initial series,
was significantly more predominant in the WW condition than
in the PP condition (p < .001). Results were discussed in relation to
hypotheses about the nature of encoding processes for pictures and
words.


The present study compared the ef- 
fects of using either pictures or words 
in a recognition task. 

A number of investigators have
shown that pictures and words function
differently as stimuli in learning
tasks. Herman, Broussard, and Todd
(1951), using either pictures or words
to represent the same common objects,
demonstrated that pictures are
learned faster than words in a serial
anticipation task. Lumsdaine (1949),
Deno (1965), and Paivio and Yarmey
(1966) all found that when pictures
were used as stimuli in a paired-associate
task, learning was faster than
when words were the stimuli. Ducharme
and Fraisse (1965), testing
written recall of words and pictures,
reported a tendency for children to
recall more words than pictures but
for adults to recall more pictures than
words. However, neither tendency
was statistically reliable.

A difficulty exists, especially in 
serial anticipation and recall tasks, 
in contriving a fair measure of learn- 
ing for the picture condition. In the 
above studies memory for pictures 
was measured by having the sub- 
ject (S) say or write the word which 
labeled the picture. This procedure 
may have introduced an extra trans- 
formation for Ss who learned pic- 
tures compared to those who learned 
words. 

The present study sought to pro- 
vide a fairer measure of memory by 
using a recognition task in which 
stimuli were either words or line 
drawings representing common ob- 
jects. Since in a recognition task the 
measure of memory is S's response, 
“Yes, I have seen it,” or “No, I have 
not seen it,” no advantage accrues to 
Ss whether the stimuli to be recog- 
nized are pictures or words. 

In addition, the present study 
sought to test hypotheses about Ss’ 
abilities to make the transition from 
picture to word stimuli and vice
versa. Therefore, one condition was 
included in which Ss saw a set of pic- 
tures and then were asked to identify 
words which represented the pictures, 
and a second condition was included 
in which Ss saw a set of words and 
then were asked to identify the pic- 
tures. In this way the importance of 
changing stimulus modes could be 
assessed and information could be 
gained about how Ss encoded the two 
kinds of stimuli. 

Several hypotheses guided the in- 
vestigation. First, on the basis of 
previous research, pictures were ex- 
pected to be more easily remembered 
than words. Therefore, the perform- 
ance of a “see pictures—recognize 
pictures” group was expected to be 
superior to a “see words—recognize 
words” group. 

Second, recognition was expected
to be better when stimuli in the test
presentation were in the same mode as
in the initial presentation, rather than
in a different mode. However, for the
highly labelable pictures used in the 
study, the tendency to encode pic- 
tures verbally was expected to be 
strong. Therefore, performance of 
a “see pictures—recognize words” 
group was expected to be equal, or 
nearly so, to that of a “see words— 
recognize words” group. 

Third, a “see words—recognize pic- 
tures” group was expected to show 
poorest performance relative to other 
groups. Such a group would have the 
disadvantage of recognizing stimuli in 
a different mode without the help 
that verbal labeling gives to the “see 
pictures—recognize words” group.
Presumably, the probability is re- 
mote that Ss seeing the words would 
encode them with an image to match 
the pictures in the experiment. 


METHOD
The design of the study included four 
conditions: (a) See pictures—recognize
pictures (PP); (b) See words—recognize
words (WW); (c) See pictures—recognize 
labels of the pictures (PW); (d) See 
words—recognize pictorial representations of 
the words (WP). 

The Ss were 120 University of Minnesota 
male and female students enrolled in in- 
troductory psychology classes. The Ss were 
randomly assigned to one of the treatment 
conditions which were administered to 
groups varying in size from eight to 20. 
Each condition ultimately contained 30 Ss. 

Stimuli were either pictures or words of 
42 common objects. The pictures were 
simple black-white line drawings, chosen 
because they were easily and consistently 
labeled. Both pictures and words were 
separately reproduced on 35-mm slides. 
The slides were projected onto a movie 
screen by a Carousel projector adjusted with 
two Hunter timers so that stimulus pre- 
sentation was 1.5 seconds per item with a 
1.5-second interstimulus interval. Mimeo- 
graphed data sheets were prepared for use 
by Ss in rating. 

The experimenter (E) instructed Ss that 
they would see a series of pictures and 
words, according to the condition, and that 
they should remember them. In addition, 
Ss were cautioned that the order of pres- 
entation was not significant. Then E pro- 
jected the 17 stimuli in the learning list 
(17 L) onto the screen. The 17 L included: 
BED, STAR, NAIL, RADIO, TABLE, TIE, NOSE, DOOR, 
PIE, CHURCH, CAT, WOMAN, BREAD, STAIRS, FISH, 
KNIFE, and BULB. Following their presenta- 
tion, E distributed the mimeographed sheets 
and instructed the Ss that they would see 
a longer series of stimuli that would include 
those originally seen. The Ss were told to 
respond to each stimulus of the second 
series by marking a five-point rating scale 
accordingly: 1—the stimulus was definitely 
not from the original series; 2—I believe 
the stimulus was not from the original
series; 3—I do not remember if the stimulus 
was in the original series; 4—I believe the 
stimulus was from the original series; 5—the 
stimulus was definitely in the original series. 
After an example the Ss saw 42 slides con- 
sisting of the 17 L slides and 25 additional 
slides (25 X) that were not included in the 
initial series. Those Ss in conditions that 
changed stimulus mode (PW and WP) were 
told that the second series would contain 
the original 17 L slides but that their form 
would be changed. This was illustrated by an
example. Each group received the same
random order of the 17 L slides, and also the 
same random order of the 42 test slides. 


The 17 L slides were associatively unrelated
while the 25 X slides were associatively
related both to each other and to 17 L 
slides. For example, STAR was an original
stimulus while MOON and SUN were members
of the test list along with STAR. 
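
As a minimal sketch of how the five-point ratings translate into the scores analyzed in the Results (4's and 5's on 17 L slides as correct identifications, 1's and 2's on 25 X slides as correct rejections, 4's and 5's on 25 X slides as intrusions, and 1's and 5's over all slides as certainty), the Python fragment below scores one S. The function name, the data layout, and the six-slide example are illustrative assumptions and are not part of the original procedure.

```python
from typing import Dict, List


def score_subject(ratings: List[int], is_old: List[bool]) -> Dict[str, int]:
    """Score one subject's 1-5 ratings against the true status of each test slide.

    ratings[i] is the rating given to test slide i; is_old[i] is True when the
    slide came from the learning list (17 L) and False when it is new (25 X).
    """
    pairs = list(zip(ratings, is_old))
    return {
        # 4's and 5's given to slides from the learning list
        "correct_identifications": sum(1 for r, old in pairs if old and r >= 4),
        # 1's and 2's given to new slides
        "correct_rejections": sum(1 for r, old in pairs if not old and r <= 2),
        # 4's and 5's given to new slides
        "intrusions": sum(1 for r, old in pairs if not old and r >= 4),
        # 1's and 5's over the whole test series
        "certainty": sum(1 for r in ratings if r in (1, 5)),
    }


# Hypothetical six-slide test series (three old slides, three new slides).
example_ratings = [5, 4, 3, 1, 2, 5]
example_is_old = [True, True, True, False, False, False]
example_scores = score_subject(example_ratings, example_is_old)
print(example_scores)
```

A rating of 3 ("I do not remember") counts toward none of the four scores, which is why the correct identifications and intrusions for a subject need not sum to the number of slides.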


RESULTS 


Three specific comparisons were of
primary interest: (a) PP versus
WW; (b) WW versus PW; and (c)
PW versus WP. The significance of
difference in each of the contrasts
was tested with a t-test.²

When the number of correct identi-
fications of the 17 L stimuli (num-
ber of 4's and 5's for 17 L) is com-
pared (Table 1), PP is superior to
WW, p < .01, and the mean for group
PW is virtually the same as WW but
is superior to WP, p < .025.

An investigation of the number of
correct identifications for the 25 X
stimuli (number of 1’s and 2's for the
25 X stimuli) again demonstrates the
superiority of the PP condition. The
Ss in the PP group were able to re-
spond “not there” more accurately
than Ss in the WW group, p < .001.
Again, conditions WW and PW were
nearly identical in their performance.
Although PW did not perform signifi-
cantly better than WP, .05 < p <
.10, the means fell in the predicted
direction (see Table 2).

2 A two-way analysis of variance might
have been used, except that such a procedure
would fail to provide direct tests of the
hypotheses under consideration. Instead t-
ratios were computed using in each de-
nominator a pooled estimate of the variabil-
ity over the four conditions. This quantity
is equivalent to mean square error (MSE)
in the analysis of variance. This procedure
is similar to one suggested by Winer (1962).


TABLE 1
MEANS AND STANDARD DEVIATIONS FOR
NUMBER OF CORRECT IDENTIFICATIONS
OF THE 17 L STIMULI

Group        M          SD      t
PP         15.73       1.17    [illegible]**
WW         14.28       2.16    [illegible]
PW      [illegible]    2.43    [illegible]*
WP         12.87       3.15

Note.—MSE = 5.47.
* p < .025.
** p < .01.


TABLE 2
MEANS AND STANDARD DEVIATIONS FOR
NUMBER OF CORRECT IDENTIFICATIONS
OF THE 25 X STIMULI

Group        M          SD      t
PP         22.97       2.46    [illegible]**
WW         19.27       4.76    [illegible]
PW         18.60       4.34    [illegible]*
WP         17.17       4.05

Note.—MSE = 15.98.
* .10 > p > .05.
** p < .001.

The tendency to admit intrusions, 
identifying 25 X stimuli as members 
of the initial series (number of 4’s 
and 5’s for 25 X), is significantly 
more predominant in the WW con- 
dition than in the PP condition, p 
< .005 (see Table 3). In line with 
the above findings, condition PW
does not differ from WW and again
is in the predicted direction relative
to WP, .05 < p < .10.

TABLE 3
MEANS AND STANDARD DEVIATIONS FOR
NUMBER OF INCORRECT IDENTIFICATIONS OF
THE 25 X STIMULI (INTRUSIONS)

Group        M          SD      t
PP          1.03       1.59    [illegible]**
WW          2.99       2.45    [illegible]
PW          3.50       2.89    1.59*
WP          4.50       2.64

Note.—MSE = 5.96.
* .10 > p > .05.
** p < .005.


A final analysis that compares the
overall certainty with which Ss in the
four conditions respond is summa-
rized in Table 4. Group PP responded
with absolute certainty (number of
1’s and 5’s for the entire 42 stimuli)
significantly more frequently than
group WW, p < .005.


TABLE 4
MEANS AND STANDARD DEVIATIONS FOR
DEGREE OF CERTAINTY OF
IDENTIFICATIONS

Group        M          SD      t
PP         39.03       4.03    2.89*
WW         35.13       5.37    0.45
PW         34.53       5.59    0.89
WP         33.33       5.73

Note.—MSE = 27.29.
* p < .005.
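
The pooled-error t-ratios described in the footnote to the Results can be checked against the printed values. The sketch below, in Python, assumes equal group sizes (30 Ss per condition), so that MSE is the mean of the four group variances, and takes each t-ratio as the difference between two means divided by the square root of 2 x MSE / n. The variable and function names are illustrative. Applied to the Table 4 values it reproduces MSE = 27.29 and the t-ratios 2.89 and 0.89; the WW versus PW ratio comes out near 0.44 against the printed 0.45, a discrepancy small enough to be attributable to rounding of the tabled means.

```python
import math

N_PER_GROUP = 30  # Ss per condition, as stated in the Method

# Means and standard deviations from Table 4 (degree of certainty).
means = {"PP": 39.03, "WW": 35.13, "PW": 34.53, "WP": 33.33}
sds = {"PP": 4.03, "WW": 5.37, "PW": 5.59, "WP": 5.73}

# With equal group sizes, the pooled error variance (MSE) is the
# average of the four group variances.
mse = sum(sd ** 2 for sd in sds.values()) / len(sds)


def pooled_t(group_a: str, group_b: str) -> float:
    """t-ratio for two group means using the pooled MSE in the denominator."""
    standard_error = math.sqrt(2 * mse / N_PER_GROUP)
    return (means[group_a] - means[group_b]) / standard_error


comparisons = [("PP", "WW"), ("WW", "PW"), ("PW", "WP")]
t_ratios = {f"{a} vs {b}": pooled_t(a, b) for a, b in comparisons}

print(f"MSE = {mse:.2f}")
for label, t in t_ratios.items():
    print(f"{label}: t = {t:.2f}")
```

Because the error term is pooled over all four conditions, each ratio is referred to the t distribution with 4 x (30 - 1) = 116 degrees of freedom rather than the 58 that a two-group test would give.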


DISCUSSION


The data confirm the hypothesis 
that pictures are more easily remem- 
bered than words. Several factors 
may contribute to this effect. Two 
possible explanations follow. 

When a stimulus is encoded, both 
a nonlinguistic and a linguistic repre- 
sentation of the stimulus may be 
stored. The nonlinguistic representa-
tion is in some sense richer for pic- 
tures than for words. When the stim- 
ulus is a picture, for example, of 
BOY, distinctive cues may be pres- 
ent, perhaps a distinctive facial ex- 
pression or distinctive clothing. These
cues are irrelevant as far as identify- 
ing the stimulus as BOY, but never- 
theless they enrich the nonlinguistic 
representation that is stored. This 
richness helps S to discriminate among 
picture stimuli. On the other hand, the
word BOY has few distinctive charac- 
teristics that help Ss to discriminate 
among words. There is little reward 
for attending to the formal attributes 
of a word, for example, its shape, since 
these attributes infrequently aid dis- 
crimination in memory tasks such as 
those under consideration. In fact, the 
formal similarity of the words may be 
a source of interference (Underwood,
1953). In short, pictures contain dis-
tinctive cues which make them more 
discriminable than their labels and 
this discriminability enhances mem- 
ory for pictures compared to their 
labels. 

Even if one is reluctant to admit 
that a nonlinguistic representation is 
stored, it is likely that the additional 
cues are stored at least verbally, so 
that the picture BOY is encoded and 
stored as BOY—WITH BOOKS IN 
HAND, SMILING, LOOKS LIKE A 
SCHOOL BOY, WALKING, AGE 
APPROXIMATELY EIGHT. On 
the other hand, the additional cues 
for the word BOY are less conspicu- 
ous: BOY—PICA TYPE, ALL UP- 
PER CASE LETTERS, THREE 
LETTERS. Moreover, the fact that 
the other stimuli have these same at- 
tributes only increases the similarity 
dimension when Ss are required to 
choose the original stimulus from the 
test list. 

An alternative hypothesis relies on 
the associative characteristics of
pictures and words. The same stim-
uli that were used in the present ex- 
periment were formerly employed in 
a comparison of free associations to 
pictures and words (Deno, Johnson,
& Jenkins, 1966). The free-associa- 
tion data suggest that associations to 
a word are different from associations 
to its pictorial representation. More- 
over, the general findings indicated 
that associative clumping, as meas- 
ured by associative overlap, was of 
lesser magnitude between sets of
pictures than between sets of words.
Consequently, if presentation of a
stimulus with instructions to remem-
ber elicits associations to that stim-
ulus, then there is a smaller chance
that associative interference will ac-
crue in the PP condition than in the
WW condition. Unfortunately, an in-
vestigation of this hypothesis is ex-
tremely difficult with the present
data, since Ss on the average missed
fewer than two stimuli in condition PP.

Comparisons of the performance of 
groups PW and WW support the 
hypothesis that pictures are encoded
and stored with their verbal label. In 
each comparison of the two groups, 
the means of the two groups are 
highly similar. Somewhat surprising 
is the apparent lack of a decrement 
in the performance of the PW group 
attributable to the change in stim- 
ulus mode. It may be concluded that 
recognizing the word labels of pic- 
tures is as easy as recognizing the 
words themselves. The Deno, John-
son, and Jenkins data suggest that
the lower associative overlap be- 
tween pictures as compared with 
words may also contribute to the 
performance of the PW group by a 
reduction in the associative inter- 
ference during the learning stage. 
From this one would anticipate a re- 
duction of the intrusions for the PW 
group in relation to the WW group
(Table 3). The data do not confirm
this expectancy. Apparently, if the 
associative factor contributes to the 
performance of any groups, both 
series must be in the same stimulus 
mode. 

The WP condition had the greatest 
difficulty identifying the original 
stimuli. While the PW group appar- 
ently stored the stimulus verbally, 
the WP group had little chance to 
store the stimulus in the form in 
which it would be tested. 

The results have some implications 
for foreign language instruction. Fre- 
quently, attempts are made to elimi- 
nate initial language habits by hav- 
ing students learn the meanings of 
foreign words through pairing these
words with pictures rather than their
initial language equivalents. The as- 
sumption is that the learner will ac- 
quire the meaning of the foreign 
word directly as meaning was ac- 
quired within the native language. 
The data from the PW condition sug- 
gest, however, that Ss tend to en- 
code a picture verbally within the 
native language anyway, so that at- 
tempts to eliminate previously estab- 
lished verbal habits may well be un- 
successful.


EFFECTS OF CONCEPTUAL SIMILARITY ON SERIAL
LEARNING AND RETENTION BY RETARDATES


ALFRED A. BAUMEISTER AND JUDITH GUFFIN


2 groups of retardates learned 8-item serial lists under 2 conditions
of item similarity. 1 list was composed entirely of pictures of animals.
The other contained pictures of common objects from different con-
ceptual categories. Association values were empirically matched for
items in the 2 lists. Ss were run to a criterion of 1 perfect recita-
tion of the items. After 48 hr. Ss were requested to relearn the
same list. The results indicated that the conceptually related items
were more difficult to learn. Relearning was not significantly affected
by the treatments. Typical serial-position curves were found. The
distribution of percentages of serial-position errors did not differ sig-
nificantly over the 2 conditions.


Although intralist similarity is gen- 
erally regarded as an important de- 
terminant of verbal learning rate 
(Underwood, 1963), the evidence for 
the effects of this variable in the per- 
formance of retardates is somewhat 
equivocal. Iscoe and Semler (Iscoe 
& Semler, 1964; Semler & Iscoe, 
1965) have reported that, for re- 
tarded subjects (Ss), conceptually 
similar items in the paired-associate 
(PA) task are more difficult to learn 
and retain than dissimilar items. On 
the other hand, Wallace and Under- 
wood (1964) found only a slight and 
nonsignificant difference in perform- 
ance on PA lists of high and low sim- 
ilarity for retarded Ss. They specu- 
lated that less intelligent Ss are 
deficient in their ability to make ap- 
propriate implicit associative re- 
sponses.

The major purpose of the present 
study was to determine whether con- 
ceptual similarity of items in a ser-
ial list would affect the learning and
long-term retention of these items by 
institutionalized retardates. Serial-
learning rate should be particularly
sensitive to this contextual variable.
There is, moreover, some evidence to
suggest that serial learning is rela-
tively more difficult than PA learning
for less intelligent individuals (Bau-
meister, 1967). An ancillary purpose
was to examine serial-position effects
in relation to intralist similarity.

1 Supported, in part, by United States
Public Health Service grant HD-02588. The
authors wish to express gratitude to the
patients and staff at Partlow State Hospital
and School for their cooperation.


METHOD


Subjects 


Forty-eight Ss with IQs between 50 and 
85 were obtained from a residential institu- 
tion for the mentally retarded. All Ss with 
gross sensory or motor defects were ex- 
cluded from the study. The mean IQ (Stan- 
ford-Binet) was 61.4; the mean chronologi-
cal age was 21.7 years. The corresponding 
standard deviations were 8.1 and 4.7 years. 


Learning Material 


The test materials were selected from a 
set of 46 colored pictures of common ob- 
jects. Preliminary to the experiment, asso-
ciation values were computed for each pic-
ture.² These values consisted of the median
latencies to identify the object in the pic- 
ture by 40 Ss from the population sampled 
in the main experiment. Two lists, each 
containing eight items, were composed from 
the parent set of 46 pictures. Items in the
two lists were matched closely on associa-
tion values. One of the lists (high similar-
ity) consisted exclusively of pictures of ani-
mals: rabbit, duck, fox, cow, sheep, tiger,
monkey, and donkey. The other list (low
similarity) comprised pictures of objects
from different conceptual categories: train,
flag, church, glove, fruit, grasshopper, pie,
and ring.

2 Gratitude is expressed to Norman Ellis
for supplying the data from which these
values were obtained.


Procedure 


The Ss were randomly assigned to one 
of two subgroups. Tests of significance indi- 
cated no mental or chronological age bias in 
the composition of these subgroups. One 
group of Ss was required to learn the list 
composed of conceptually similar items and 
the other received the dissimilar items. 

In order to minimize the effects of con- 
founding of item position, the items in each 
list were systematically presented in orders 
following a Latin-square arrangement. Eight 
different orders, with three Ss under each, 
were employed, thereby insuring that every 
item appeared in every position an equal 
number of times. 
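
One way to generate eight orders in which every item occupies every serial position exactly once is a cyclic Latin square, sketched below in Python. The source states only that a Latin-square arrangement was followed, so the cyclic construction, the function name, and the starting order of the items are assumptions made for illustration.

```python
from typing import List

# Items of the low-similarity list, as given in the Learning Material section.
LOW_SIMILARITY = ["train", "flag", "church", "glove",
                  "fruit", "grasshopper", "pie", "ring"]


def cyclic_latin_square(items: List[str]) -> List[List[str]]:
    """Return len(items) presentation orders, each a rotation of the first.

    Row r is the item list rotated left by r places, so each item appears
    exactly once in every serial position across the set of orders.
    """
    k = len(items)
    return [[items[(row + col) % k] for col in range(k)] for row in range(k)]


orders = cyclic_latin_square(LOW_SIMILARITY)
for order in orders:
    print(", ".join(order))
```

With three Ss run under each of the eight orders, every item falls in every position three times per group of 24. A cyclic square balances position only: each item is always followed by the same neighbor, so it does not control for sequence effects between particular items.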

The stimuli were presented on a screen 
using a Kodak Carousel projector. On the
first presentation of the list, S was required 
to name the objects. Items were presented
for 4 seconds. A blue slide appeared on the 
screen for 4 seconds between items. The S 
was instructed to anticipate the next pic- 
ture during this interitem interval. An in- 
tertrial interval of 20 seconds was employed. 
The criterion of learning consisted of one 
perfect recitation of the items. Forty-eight
hours later, S relearned the list to the same
criterion.


RESULTS 


The entire analysis was based upon 
errors. Summary data appear in Table 
1. A t-test indicated (p < .01) that
the list composed of conceptually re-
lated items was more difficult to learn. 
No differences were found with re- 
spect to percentage savings in re- 
learning the list after a 48-hour in- 
terval. A more detailed examination 
of the types of errors made under the 
two conditions indicated the list of 
similar items produced somewhat 
more frequent overt within-concept 
errors. However, proportions of in- 
trusion and omission errors were not 
significantly different for the two 
groups. A statistical comparison of 
the serial-position curves under the 
two treatment conditions revealed no 
differences. 
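
The report does not state how the percentage of savings was computed. A common definition, assumed here, expresses the reduction from original learning to relearning as a percentage of the original-learning score; since the analysis was based on errors, the sketch below in Python uses error counts. The function name and the example figures are hypothetical.

```python
def percentage_savings(original_errors: float, relearning_errors: float) -> float:
    """Savings score: percentage reduction in errors from original learning to relearning."""
    if original_errors <= 0:
        raise ValueError("original_errors must be positive")
    return 100.0 * (original_errors - relearning_errors) / original_errors


# Hypothetical S: 20 errors to criterion originally, 4 errors when relearning.
example_savings = percentage_savings(20, 4)
print(example_savings)
```

Under this definition a score of 100 means the list was relearned without error and a score of 0 means relearning was as costly as original learning.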


DISCUSSION


The results of the present experi- 
ment partially corroborate those re- 
ported earlier by Iscoe and Semler 
(Iscoe & Semler, 1964; Semler & Iscoe,
1965) with respect to the detrimental 
effects of high conceptual similarity 
of items in structured verbal learning 
tasks. This effect seems to apply to 
the serial as well as the paired-asso- 
ciate learning of retarded Ss. 

The theoretical analysis proposed 
by Wallace and Underwood (1964) 
would predict some of these findings. 
In brief, they have speculated that 
two implicit responses occur when a 
stimulus is presented in a verbal 
learning context. One of the responses 
involves the act of perceiving the
stimulus and is described as a “rep-
resentational response” (RR). The
other covert response is said to be
produced by the stimulus properties
of the RR. This second response is
labeled the “implicit associative re-
sponse” (IAR) and consists of all the
verbal associations called forth by
the stimulus. Mediators are viewed
as one type of IAR. When items
which elicit the same IARs must be
learned in some order, fewer differen-
tiating cues are available and per-
formance should be impaired. Thus a
list composed of items from the same
conceptual category should be more
difficult to learn than a list of items
from several different categories. The
data obtained here are generally in
agreement with this view, although
one might have expected a greater
proportion of within-concept intrusion
errors under the condition of high
similarity.


TABLE 1
MEAN ERRORS FOR SIMILAR AND DISSIMILAR CONDITIONS BY ITEM POSITION AND TOTALS
FOR ORIGINAL LEARNING (OL) AND PERCENTAGE OF SAVINGS SCORES FOR
RELEARNING (RL) AFTER 48 HOURS

                          OL position
Group          1     2     3     4     5     6     7     8    Total   Percentage of savings
Similar       3.0   4.1   4.7   4.8   5.0   5.2   4.4   3.2   34.4            86
Dissimilar    2.3   2.5   2.9   3.1   3.6   3.8   3.3   1.9   23.4            80

Note.—N = 24 for each group.

Relearning of the serial lists was 
not markedly different under con- 
ditions of high and low categorical 
similarity. However, little variabil- 
ity was noted in the relearning scores. 
Possibly, the effects of conceptual 
similarity upon retention are related 
to the degree of original learning. It 
should be noted that Semler and 
Iscoe (1965) found more long-term 
forgetting of paired associates when 
the PA sets were conceptually similar. 

Although some wide interindivid- 
ual differences were noted, typical 
group serial-position curves (nega-
tively skewed and bow-shaped) were
found under both treatment condi- 
tions. The invariance principle re- 
garding serial-position effects in this 
type task (McCrary & Hunter, 1953)
appears to hold for contextual factors. 
Although increasing similarity of 
items affected overall difficulty, the 
distribution of percentages of serial- 
position errors did not differ signifi- 
cantly with respect to either bowing 
or skewing for the two treatment 
conditions.


INDIVIDUAL DIFFERENCES IN MEMORY SPAN
WITH AND WITHOUT ACTIVITY INTERVENING
BETWEEN PRESENTATION AND RECALL¹


ARTHUR E. WHIMBEY AND SANDRA L. LEIBLUM
University of Illinois 


30 Ss were tested on 3 memory tasks. 1 of the tasks was a form of 
the traditional memory-span test. The other 2 tasks required the 
Ss to engage in some activity between the presentation and recall of 
the digit series. All 3 tasks were found to be reliable measures of in- 
dividual differences. Furthermore, although the presence and type of 
activity intervening between presentation and recall was found to 
have a significant effect on the difficulty of the task, the intercorrela- 
tions of the 3 tasks were very high. It was concluded that inter-
spersing activity between presentation and recall does not change 
the psychological processes involved, but only affects the difficulty
of the tasks.


There has recently been renewed
interest in short-term memory
(STM) as an important human abil-
ities variable. Ellis (1963) postulates
that mental retardates are deficient
in memory-trace strength. Baumeister
and his co-workers (Baumeister &
Bartlett, 1962, 1963) have found
memory trace to be an important
factor in the performance of retar-
dates on the Wechsler Intelligence
Scale for Children and have related
memory span to learning tasks. Jen-
sen (1964) pointed out that one of
the main reasons that memory-span
tests tend to have low correlations
with other measures of intelligence
is that most span tests are unreliable.
He showed that the span tests of the
Wechsler Adult Intelligence Scale
(WAIS) correlated very highly with
the WAIS vocabulary test when the
correlation was corrected for attenu-
ation.

Researchers in the area of verbal
learning have studied STM exten-
sively (e.g., Bruning & Schappe, 1965;
Conrad, 1960; Peterson & Peterson,
1959) but have been interested only
in parameters that affect group
means. Typically these studies re-
quire the subjects (Ss) to engage in
some activity between presentations
and recall in an attempt to prevent
rehearsal. Most pertinent to the pres-
ent study, Bruning and Schappe
(1965) have shown that the amount
of forgetting depends upon the type
of intervening activity, and Conrad
(1960) found that merely having S
say the word “0” before trying
to repeat a series of eight single-digit
numbers resulted in a large decre-
ment of percentage correct. However,
little research has been performed
investigating the stability of individ-
ual performance under different con-
ditions of intervening activity. In-
dividual differences in resistance to
interference might actually be unre-
lated to individual differences in
memory ability when no intervening
stimuli are present; the intercorrela-
tions among memory tasks involving
different conditions of intervening ac-
tivity might be quite low.

1 This research was supported by a grant
from the University Research Board, Uni-
versity of Illinois.

In view of the increasing impor- 
tance attributed to STM by research- 
ers interested in individual differences 
and the extensive amount of research 
being done on STM by learning psy-
chologists, it seemed desirable that:
(a) reliable tests of memory span be 
developed; (b) group-administered 
tests be devised; (c) an investigation 
be made of the relationship between 
individual differences in memory span 
with and without activity intervening 
between presentation and recall. 


METHOD 


Subjects 


The Ss were 30 undergraduates who were 
paid for participating in the experiment. 


Tasks 


Each S was tested on three tasks. The 
tasks involved recalling series of single-digit 
numbers. The series were between 5 and 9 
digits in length and were presented by a 
tape recorder at the rate of 2 digits per 
second. This rapid presentation rate pre-
vented rehearsal or the use of other memory 
techniques between the presentation of each 
digit. In a previous unpublished study em- 
ploying a 1 digit per 1.5 seconds rate, dif- 
ferent Ss used different techniques, so that 
the scores measured individual differences in 
techniques as well as individual differences 
in memory span. 

Each task consisted of 30 trials. In each
block of five trials, one series each of 5, 6, 7,
8, and 9 digits was presented in random or-
der. The three tasks differed in the activity
which intervened between presentation and 
recall of the series. 

Task 1. The Ss were instructed to at- 
tempt to write the series of digits as soon as 
the presentation was completed. 

Task 2. In this task, Ss were told that 
after the series was presented, they were to 
first write the digit "0" and then try to 
write the digit series. 

Task 3. For this task, Ss were given a
sheet with the following word-letter (W-L)
pairs: bear-Q, bird-L, crab-V, fish-H, frog- 
R, flea-J. The presentation of each series of 
digits was followed by one of the six 
words. The S was to look at the W-L sheet, 
write the letter corresponding to the word, 
and then attempt to write the digit series. 


Procedure 


The Ss were tested in groups of about 10. 
Three groups of Ss were used, and the 
order of presenting the three tasks was
counterbalanced across groups. A 10-minute
rest break and a 10-minute syllogism test 
intervened between each task. 

In scoring the tasks, a series was consid- 
ered correct if it was reproduced in the 
order presented. The S’s score was deter- 
mined by allotting 5 points for each 5-digit
series correct, 6 points for each 6-digit se- 
ries correct, etc. 
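
The scoring rule can be sketched in Python as follows. The function name and the three example series are hypothetical; only the rule itself (a series counts when reproduced in the order presented and earns as many points as it has digits) comes from the description above.

```python
from typing import List


def span_score(presented: List[str], recalled: List[str]) -> int:
    """Sum the lengths of all series reproduced exactly in the order presented."""
    return sum(len(p) for p, r in zip(presented, recalled) if p == r)


# Hypothetical trials: the 6-digit series is recalled with two digits transposed.
presented_series = ["48213", "907146", "3581920"]
recalled_series = ["48213", "907164", "3581920"]
example_score = span_score(presented_series, recalled_series)

# Maximum possible score on one task: 6 blocks, each with one series of 5 to 9 digits.
max_score = 6 * sum(range(5, 10))
print(example_score, max_score)
```

Under this rule the highest attainable score on a task is 210, which gives a scale for the task means reported in the Results.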


RESULTS 


To evaluate the effect of the dif- 
ferent intervening activities on the 
amount of recall and to assess the ef- 
fects of practice, an analysis of var- 
iance was performed. Counterbalanc- 
ing the presentation order of the three 
tasks for the three groups resulted in 
a Latin-square design. An equal num- 
ber of Ss did not arrive at each of the 
three testing sessions. Furthermore, 
two Ss did not perform all three tasks 
properly (e.g., did not write “0”). To
equalize the group frequencies, sev- 
eral Ss were randomly discarded, 
and the analysis was based on seven 
Ss per group. 

The mean performance on the re-
spective tasks was: Task 1, 137.2;
Task 2, 107.6; Task 3, 58.3. The
task effect was found to be significant
(F = 82.01, df = 2/36, p < .01),
but neither the practice effect
(F < 1), the sequence effect (F < 1),
nor the residual (F < 1) produced
significant differences. Previous un- 
published research based on the in- 
trospective reports of Ss suggested 
that writing “0” could be done auto- 
matically without interfering with 
subvocal rehearsal of the digit series. 
To determine whether writing “0” did 
have an effect, a test was made on 
the difference between the means on 
Task 1 and on Task 2. The difference 
was found to be significant (F =
22.61, df = 1/36, p < .01). 

To determine whether the three
tasks were measuring the same STM
process, the intercorrelations of the
tasks were obtained and are presented
in Table 1. The r between Tasks 1
and 3 was based on 30 Ss; the other
two rs were based on 29 Ss. An esti-
mate of the reliability of each of the
tasks was made by correlating the
scores from Blocks 1, 3, and 5 with
the scores from Blocks 2, 4, and 6,
and then using the Spearman-Brown
formula for doubled length. The re-
sulting values are presented on the
main diagonal of Table 1.


TABLE 1
INTERCORRELATIONS AND RELIABILITIES OF
THE TASKS

Task       1       2       3
1        .913    .835    .803
2                .881    .735
3                        .877
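
The two psychometric formulas at issue here can be written out in a few lines of Python. The Spearman-Brown step-up for doubled length is 2r / (1 + r), and the standard correction for attenuation divides an observed correlation by the square root of the product of the two reliabilities. The function names are illustrative, the half-test correlation of .84 is a hypothetical input, and the corrected correlations are computed here for illustration; they were not reported by the authors.

```python
import math


def spearman_brown_doubled(r_half: float) -> float:
    """Reliability of the full-length test from the correlation between its two halves."""
    return 2 * r_half / (1 + r_half)


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimated correlation between true scores given the observed r and both reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Reliabilities and intercorrelations as reported for the three tasks.
reliability = {1: 0.913, 2: 0.881, 3: 0.877}
observed_r = {(1, 2): 0.835, (1, 3): 0.803, (2, 3): 0.735}

corrected_r = {
    pair: correct_for_attenuation(r, reliability[pair[0]], reliability[pair[1]])
    for pair, r in observed_r.items()
}

# Hypothetical odd-even block correlation of .84 stepped up to full length.
stepped_up = spearman_brown_doubled(0.84)

print(f"Spearman-Brown from r = .84: {stepped_up:.3f}")
for pair, r in corrected_r.items():
    print(f"Tasks {pair[0]} and {pair[1]}: corrected r = {r:.3f}")
```

An odd-even correlation near .84 would step up to roughly the .913 reliability reported for Task 1, and with reliabilities this high the correction for attenuation raises each intercorrelation by about a tenth.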


DISCUSSION


The significant effect of tasks found
in the present experiment replicates 
previous findings (e.g., Conrad, 1960) 
that intervening activity tends to de- 
crease recall. The technique of includ- 
ing a brief period of activity between 
the presentation and recall of some 
information has been used exten- 
sively since it was first introduced by 
Peterson and Peterson (1959). The 
important finding of the present ex- 
periment is that the three tasks cor-
related highly with each other (the 
correlations would be even higher if 
corrected for attenuation), despite 
the different intervening activities. 
Since Ss tended to maintain their 
relative positions on the tasks, the 
common practice of only comparing 
the mean performance on tasks in- 
volving different intervening activity 
is legitimate. It appears that inter- 
spersing activity between presenta-
tion and recall affects only the dif- 
ficulty of the task but does not change 
the psychological processes involved.
If this were not true, a much more
extensive analysis of the behavior 
would be necessary. Furthermore, a 
learning theory describing perform- 
ance on these tasks would only need 
a level parameter to account for indi- 
vidual differences. Complicated task- 
by-subject functions would not be 
necessary. 

Further support for the conclusion 
that the intervening activity affected 
only difficulty is provided by com- 
paring the difference in difficulty level
of pairs of tasks with the respective 
correlations. There is no trend for the 
correlation between tasks to decrease 
as the difference in difficulty increases 
(Tasks 1 and 3 are further apart in 
difficulty than Tasks 2 and 3, yet the 
r between Tasks 1 and 3 is .803,
whereas the r between Tasks 2 and 3 
is only .735). 

The absence of a significant prac- 
tice effect indicates that very little, 
if any, improvement with practice 
occurred in the experiment. Improve- 
ment with practice often indicates 
that S devised a strategy or tech- 
nique to help him remember the ma- 
terial (Gates & Taylor, 1925). The 
absence of a significant effect sug- 
gests that it was impossible to develop 
elaborate recall strategies in the ex- 
perimental tasks. The tasks assess
short-term memory, not recall strate-
gies.

Finally, it has been demonstrated 
that the measurement of memory 
span is not intrinsically unreliable.
Good measures of span can even be 
obtained in a group-testing situation. 
The relationship between memory 
span and other general intelligence 
variables should now be obtained di- 
rectly, rather than being estimated 
by correction formulas. To what- 
ever degree memory span is found to 
relate to other psychometric meas-
ures, the constructs and relationships
discovered in experimental learning
studies should affect performance on
these measures.


PITCH DISCRIMINATION AMONG PRIMARY SCHOOL
CHILDREN¹


ORPHA K. DUELL AND RICHARD C. ANDERSON
University of Illinois


The pitch discrimination performance of 168 1st, 2nd, and 3rd graders
was investigated in a natural classroom setting. Ss judged pairs of
pure tones presented by a tape recorder as either the same or different.
The standard ranged from 390 to 440 cps while the interval between
the standard and the comparison ranged from 1/3 of a half-step to a
major sixth. 68% of the Ss discriminated intervals as large as and
larger than a half-step; however, 4% did not discriminate differences
as large as a sixth. Performance improved from the 1st to the 3rd
grade. Performance did not change within the 30-minute testing
period.


The ability to discriminate between
different pitches is recognized as nec- 
essary for the appreciation and pro- 
duction of music. Much attention 
has been devoted by psychophysicists 
to the measurement of a difference 
limen (DL), that smallest pitch dif- 
ference a person can discriminate ac- 
curately and reliably, for various 
areas along the pitch continuum for 
adult subjects (Ss). No such exhaus- 
tive effort has been made for younger 
Ss. Books for the preparation of fu- 
ture music teachers in elementary 
schools admonish them to draw at- 
tention to the direction of melodic 
movement and at least one book 
(Bergethon & Boardman, 1963) sug- 
gests that by the end of the second 
grade students should be able to
“identify melodic movement as mov-
ing by steps and skips.” Using the
scale as a reference point this would
imply that Ss are expected to dis- 
criminate differences as small as a 
half-step, which would be well within 
their ability if one generalizes from 
adult data; however, data from a
study conducted in Russia (Repina,
1961b) suggest that Ss of this age
might not discriminate intervals of
this size. Repina (1961b) measured
the DL for a group of 40 preschool
Ss ages 3 to 7, of which 10 were in
the 6- to 7-year-old group. Of these
10 Ss one was reported to have a
threshold "greater than a sixth," four
had a threshold of a minor third to a
perfect fourth, and five had a thresh-
old of a half-step to a major second.
These DLs were obtained using piano
tones whereas Repina (1961a) found
even larger DLs when using pure
tones, which are generally used in
adult DL studies, with a different
group of eight Ss.

¹ This report is based upon a master’s
thesis submitted by the first author and
directed by the second. We are indebted to
Jordon Greenberg for advice and for as-
sistance recording the tapes and to Robert
Stake for statistical advice.

The purpose of the present study 
was to discover how large a pitch in- 
terval primary school children differ- 
entiate in the natural classroom set- 
ting, since this is the setting in 
which they produce and listen to 
music. If the Repina (1961a, 1961b) 
data are representative of young 
children's pitch discrimination per-
formance, then school music teach-
ers should not assume that children
can already hear pitch differences of
the size found in music (the half-
step) and should concentrate on im-
proving their students' abilities to
hear such differences before expecting
them to enjoy listening to and sing-
ing songs containing such intervals.


METHOD 


Sample 


The Ss were 168 children from the first, 
second, and third grades in a midwestern 
school. None of the Ss had been identified in 
the regular school testing program as suf- 
fering from hearing loss. 


Equipment 


The stimuli were pure tones generated by 
oscillators, recorded on tape, and presented 
to the Ss by means of a Wollensak Model 
T1616 tape recorder augmented with a 12- 
inch external speaker. It is recognized that 
by the time the sounds were recorded and 
released in the room, they could no longer 
be considered "pure." No attempt was made 
to employ a standard intensity level since 
a measure taken at 7 feet from the speaker 
would yield one reading while a measure 
taken at 8 feet in the same direction would 
yield a different reading. The volume control 
was set so that the stimuli could be easily 
heard in all areas where Ss sat. The intensity
level was probably somewhere in the range 
of 50-60 decibels sound pressure level. All 
groups of Ss were tested in the same room 
with a maximum of 31 Ss per group. 


Procedure 


The method for presenting the stimuli 
was one in which the comparison tones are 
randomly arranged and the standard gradu- 
ally changed through a small range. This 
method was found to be the most sensitive 
of 17 different methods compared by Harris 
(1949). The first step in this procedure is to
choose a range within which to vary the 
standard (the tone to which the other tone, 
the comparison, is to be compared) that is 
small enough so that one can assume that 
the DL will not be different at the two 
extremes. Harris (1949) varied his standard 
from 750 to 850 cps, a range of 100 cps. The 
range of interest in the present experiment 
was the singing range of the Ss, middle C 
(261 cps) to the E an octave and a third 
above middle C (659 cps). The middle range 
within this broader range was explored. A 
100 cps range would not have been ap-
propriate at this point in the pitch con-
tinuum since the DL measured in cps be-
comes progressively smaller as one moves
down the pitch continuum. Since Harris's
range included three half-steps, a cps range
of three half-steps from 390 cps to 440 cps
was covered. The standard systematically 
began at the lowest extreme, increased five 
cps on each comparison until the upper limit 
was reached, then descended by five cps 
steps until the lower limit was again reached.
This cycle continued until all comparisons 
had been made. The comparison tones, both 
those above and below the standard, were 
arranged in random order and each interval 
was calculated from the standard with which 
it was paired. All intervals in this study were 
calculated according to the formula for the 
even-tempered scale, which is the tuning 
most generally used. 
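
The even-tempered calculation mentioned above can be illustrated with a short sketch. It assumes the usual equal-temperament relation, in which each half-step multiplies frequency by the twelfth root of 2. The function names, the interval table, and the 415-cps example standard are illustrative choices and are not taken from the study.

```python
# Sketch only: equal-tempered interval arithmetic under the assumption that
# one half-step corresponds to a frequency ratio of 2 ** (1 / 12).

# Interval sizes in half-steps (standard musical values; names are illustrative).
INTERVALS_IN_HALF_STEPS = {
    "one-third half-step": 1 / 3,
    "two-thirds half-step": 2 / 3,
    "half-step": 1,
    "minor third": 3,
    "perfect fourth": 5,
    "major sixth": 9,
}


def comparison_frequency(standard_cps, half_steps):
    """Frequency lying `half_steps` half-steps above (or, if negative, below) the standard."""
    return standard_cps * 2.0 ** (half_steps / 12.0)


def interval_width_cps(standard_cps, half_steps):
    """Width of the interval in cps, measured upward from the standard."""
    return comparison_frequency(standard_cps, half_steps) - standard_cps


if __name__ == "__main__":
    standard = 415.0  # hypothetical standard near the middle of the 390-440 cps range
    for name, size in INTERVALS_IN_HALF_STEPS.items():
        width = interval_width_cps(standard, size)
        print(f"{name:22s} {width:6.1f} cps above {standard:.0f} cps")
```

Under this assumption, a standard near the middle of the range gives about 8 cps for one-third of a half-step and about 24 to 25 cps for a half-step.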

Following Repina (1961a, 1961b), in ad- 
dition to the unison, the intervals that 
were investigated were the major sixth, the 
perfect fourth, the minor third, a half-step, 
two-thirds of a half-step, and one-third of a 
half-step. Each of the intervals was pre- 
sented four times above the standard and 
four times below the standard. The unison 
was presented 16 times so that "same" was 
the correct response one-third as many times 
as “different” was the correct response. 

Four orders of the comparisons were made 
so that each order contained every compari- 
son once. The comparisons were randomized 
within each order. After each order was 
presented Ss stood and stretched for about 
2 minutes. It took approximately 30 minutes 
to give the directions and complete the test. 
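
The presentation schedule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes the 16 unison presentations were spread evenly (four per order), it uses Python's `random` module in place of whatever randomization the authors used, and all function names are hypothetical.

```python
import random

INTERVALS = [
    "major sixth", "perfect fourth", "minor third",
    "half-step", "two-thirds half-step", "one-third half-step",
]


def standard_sequence(n_trials, low=390, high=440, step=5):
    """Standard rises from `low` to `high` in `step`-cps steps, then falls back, repeatedly."""
    ascending = list(range(low, high + 1, step))   # 390, 395, ..., 440
    descending = ascending[-2:0:-1]                # 435, 430, ..., 395
    cycle = ascending + descending
    return [cycle[i % len(cycle)] for i in range(n_trials)]


def build_order(rng, unisons_per_order=4):
    """One order: every interval once above and once below, plus the unisons (assumed count)."""
    comparisons = [(name, direction)
                   for name in INTERVALS
                   for direction in ("above", "below")]
    comparisons += [("unison", "same")] * unisons_per_order
    rng.shuffle(comparisons)                       # randomize within the order
    return comparisons


def build_schedule(n_orders=4, seed=0):
    """Return a list of (standard_cps, interval_name, direction) presentations."""
    rng = random.Random(seed)
    comparisons = [c for _ in range(n_orders) for c in build_order(rng)]
    standards = standard_sequence(len(comparisons))
    return [(std, name, direction)
            for std, (name, direction) in zip(standards, comparisons)]


if __name__ == "__main__":
    schedule = build_schedule()
    print(len(schedule), "presentations")
    print(schedule[:3])
```

With four orders this yields 64 presentations, 48 calling for "different" and 16 for "same", which matches the one-to-three ratio stated in the text.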

Each stimulus presentation began with
.25 seconds of white noise which informed
the experimenter (E) that a new presenta-
tion was beginning. Then, in order, came 4
seconds of silence, during which E stated the
number of the presentation; .75 seconds of
the standard tone; .25 seconds of silence;
.75 seconds of the comparison tone; 5 sec-
onds of silence during which Ss marked their
answers; and then .25 seconds of white noise
that ended one presentation and signaled
the beginning of the next. The tape re-
corder was equipped with a foot pedal which
enabled E to stop the tape between pres-
entations if necessary to replace broken
pencils and to insure that all Ss were on the
same question.

In an earlier experiment (Duell, 1965), Ss
judged the second tone as higher, lower, or
the same as the first tone in a pair. First
graders performed at chance level on this
task, perhaps because they were confused
about the concepts higher and lower. In this
study Ss judged the pair as either the same
or different. They were supplied with answer
sheets, which contained for each presentation
the alternatives D for different and S for
same. General instructions were given and 
sample tones were sounded. Two examples 
(major sixth above and unison) were then 
presented with the correct answer announced 
before each, followed by three examples 
(unison, major sixth above, and perfect fifth 
above) with the correct answer given after 
each example. The Ss were warned that some 
of the differences would be very small. 

All Ss marked the examples correctly and 
all made some response to the majority of 
presentations. Observation suggested that all 
or almost all worked diligently at the task. 
There were few signs of inattention even at 
the end of the testing period. 


RESULTS AND DISCUSSION


Figure 1 traces the pitch discrimi- 
nation performance for each grade. 
The means for the unison for Grades 
1, 2, and 3 were 85%, 83%, and 96%. 
Analyses of variance indicated 
that there were significant differences 
(α = .01 for all tests mentioned
herein) among intervals and among 
grades. As can be seen in Figure 1,
the percentage of correct discrimina-
tions increased as the size of the in-
terval increased and improved from
Grade 1 to Grade 3. The Fs for or-
der were all less than 1.00, indicating
neither a practice effect nor a loss of
efficiency due to fatigue as testing
progressed, or perhaps these factors
canceled each other out. The Fs for
the Order × Interval interactions
were significant; however, graphs re-
vealed no trends for any of the
groups. Intervals were slightly but
not significantly easier to discrimi-
nate when the comparison tone was
lower than the standard.


Fig. 1. Discrimination as a function of the size of the interval.
(Ordinate: mean per cent correct; abscissa: intervals; separate curves for Grades 1, 2, and 3.)


TABLE 1

CUMULATIVE PERCENTAGE OF Ss WITH
THRESHOLDS OF A GIVEN SIZE

                                    Grade
Interval                 First   Second   Third   All

Sixth                     94       94      100     96
Fourth                    91       91      100     94
Sharpened second          78       8[?]     95     86
Half-step                 59       66       80     68
Two-thirds half-step      36       40       60     45
One-third half-step        6       12       20     13

Note.—These percentages are based
upon mean DLs obtained by combining
the UDLs and LDLs. N for the first grade =
49; for the second grade = 58; for the third
grade = 61. Total N = 168.

The principal goal of this study 
was to determine the size of the pitch 
interval which children will dis- 
criminate in a classroom milieu. The 
DL was defined as the smallest in-
terval on which three of the four dis-
criminations were correct and which
was followed by an interval in which 
three out of four discriminations were 
also correct. Table 1 contains the 
cumulative percentage of children 
who obtained DLs of a given size. 
About 30% of the children tested 
were unable to discriminate differ- 
ences as large as a half-step and 4% 
were apparently unable to discrim- 
inate differences as large as a sixth. 
Adult laboratory studies have re-
ported DLs of about 3 cps within the
area of the pitch continuum under in- 
vestigation, while the smallest inter- 
val in this study is approximately 8 
cps and the half-step is approxi- 
mately 24 cps. 
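
The DL criterion defined above can be written as a short sketch. Two points are assumptions rather than statements from the text: "followed by an interval" is read as the next larger interval, and the largest interval is allowed to qualify on its own because no larger interval follows it. The function name and the sample record are hypothetical.

```python
# Intervals ordered from smallest to largest.
INTERVAL_ORDER = [
    "one-third half-step", "two-thirds half-step", "half-step",
    "minor third", "perfect fourth", "major sixth",
]


def difference_limen(correct_counts, order=INTERVAL_ORDER, criterion=3):
    """Return the smallest interval meeting the criterion, or None.

    `correct_counts` maps each interval name to the number of correct
    discriminations out of four presentations in one direction.
    """
    for i, name in enumerate(order):
        if correct_counts[name] < criterion:
            continue
        is_largest = i == len(order) - 1
        # Assumed reading: the next larger interval must also meet the criterion.
        if is_largest or correct_counts[order[i + 1]] >= criterion:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical record: an isolated success on the smallest interval is not
    # accepted because the next larger interval falls below three of four.
    record = {
        "one-third half-step": 3, "two-thirds half-step": 2, "half-step": 3,
        "minor third": 4, "perfect fourth": 4, "major sixth": 4,
    }
    print(difference_limen(record))
```

The requirement that the following interval also be passed guards against crediting a child with a small DL on the strength of a few lucky guesses.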

It must be kept in mind that the 
DLs identified in a free-field situa- 
tion without earphones are un- 
doubtedly larger than the DLs that 
would be found in a psychophysical 
laboratory. On the other hand, the
procedures used in the present study
do resemble the conditions under 
which school music programs are con- 
ducted. Whether results entailing 
“pure” tones can be generalized to 
situations where complex tones are 
used is a moot question, and whether 
data gathered using any one type of 
complex tone would be representative 
of various complex tones is unanswer- 
able at present. Nonetheless, until 
such questions are answered, the pres- 
ent study raises doubts about the 
value of primary school music pro- 
grams for some children, and sug- 
gests a role for pitch discrimination 
training if effective techniques can be 
devised.


TRANSFER OF SOLUTION RULES IN PROBLEM
SOLVING


FRANCIS J. DI VESTA AND RICHARD T. WALLS
Pennsylvania State University 


The effects of 5 stimulus-response relationships between Tasks A and 
B on the transfer of solution rules in problem solving were investi- 
gated. Facilitation was observed where solution rules and their appli- 
cations were similar in the 2 tasks. Less facilitation occurred when 
the associations of solution rules with discriminative stimuli were 
interchanged in Task B. Least facilitation was found in the control 
groups who learned solution rules in Task A that were inappropriate 
for the solution of problems in Task B. 3 levels of complexity in the 
application of rules to the solution of problems were identified. 


The present experiment was con-
trived to represent certain assump-
tions about problem solving posited 
by Schulz (1960). In particular we 
were interested in his analysis of 
such tasks as the water jar and ana- 
gram problems. These ostensibly uni- 
tary tasks may be analyzed into two 
functionally separable parts closely 
resembling the transfer paradigm. 

Thus, in Rees and Israel's (1935) 
study, subjects (Ss) first solved a 
number of single-solution anagrams 
for words belonging to a given con- 
ceptual class (e.g., “nature,” or “eat- 
ing” words). The S learned to react 
to the generalized discriminative 
stimulus of “concept-class”—nature 
words or eating words. Then, in the 
second task, Ss solved anagrams for
which the solution might be a word
within the class represented in the
first task or a word in any other
category. Accordingly, there is the
possibility of solving the anagram
with one of two response-solutions;
but S continues to use the solution
previously associated with the dis-
criminative stimulus, a given con-
cept-class. The effect is one of nega-
tive transfer.

A more detailed analysis is made
by Schulz (1960) of the water-jar
task (Luchins, 1942) in which
the transfer processes parallel those
found in the anagram problems. In
the first part of the task (Task I) 
S is presented with six problems. Each 
problem contains a stable stimulus 
pattern of three water jars and a 
stable response-solution (e.g., B-A-2C)
or solution rule. The set of water
jars can be considered a discrimina- 
tive stimulus, that is, a cue for a 
specific solution rule, since three 
water jars are common to all prob- 
lems. As a result of problem varia- 
tion, but always with three water 
jars, the discriminative stimulus be- 
comes generalized and associated with 
a given solution rule. With repeated 
presentations of problems, in which 
the three water jars are always pres- 
ent, the discriminative stimulus gains 
increased associative strength with 
the solution rule. Thus, in Task I, 
the most consistent S-R association is 
that between “water jars” and the 
solution rule of B-A-2C. It is this 
association that gains the most habit- 
strength, in the classic example, 
through six repeated trials. 

In the second part of the task 
(Task II), S is presented with two 
or more additional problems. These,
since they are also water-jar prob- 
lems, embody the same discriminative 
stimuli present in Task I. However, 
the correct response solution has been 
changed from B-A-2C to the simpler
A+C or A-C solutions. Accordingly,
S tends to employ the B-A-2C solu-
tion, the stronger habit resulting from
association with the general water-jar
stimuli (not the size of the jars) in
the six previous problems. As in the
above analysis of the anagram prob- 
lem, the effect is one of negative 
transfer. 

It is apparent from this analysis 
that the water-jar and anagram tasks 
are not only comprised of two parts, 
but they also involve nonspecific or 
generalized discriminative stimuli that 
are associated with response solutions. 
These relationships clearly parallel 
those found in transfer paradigms. 
Whether positive or negative transfer 
is obtained depends, in part at least, 
on the relationship of the nonspecific 
discriminative stimulus (S) and solu- 
tion rule (R) in the initial task to the 
S-R relationships in the second task. 

In this context, the notions of 
"rigidity," "set," and the like, com- 
monly used to describe the effect ob- 
tained with the water-jar, anagram, 
functional fixedness, and “Umweg” 
problem-solving tasks may represent 
instances of negative transfer. An anal- 
ysis of the task into its component 
parts suggests that positive transfer, 
in addition to negative transfer ef- 
fects, may be observed in the situa- 
tion. If these tasks are viewed as 
being comprised of a single series of 
discrete problems there is the pos-
sibility that one may be “blinded”
not only to the existence of the origi- 
nal task, transfer-task components 
but also to the possible existence of an 
effective or functional nonspecific 
discriminative stimulus. Perhaps as a 
consequence of this neglect only one 
of a number of possible transfer 
paradigms was investigated in which 
the actual emphasis was on manipu-
lations only of the solution rules or
strategies—although ostensibly both
stimuli and responses were changed.

The analysis of transfer processes 
in verbal learning has demonstrated 
that many variations of such stimu- 
lus-response (S-R) relationships be- 
tween the Task I and Task II phases 
of learning are particularly influ- 
ential factors in determining the 
amount and direction of transfer. The 
A-Br paradigm, for example, has
been extensively investigated and 
shown to produce massive amounts of 
interference (Twedt & Underwood, 
1959). In this paradigm, Ss learn 
several (A-B) pairs of words such 
as “house-young” and “certain-while” 
during Task I; then, in Task II, the 
combinations are re-paired to form 
(A-Br) combinations such as “house-
while” and “certain-young.” Several
additional paradigms can be em- 
ployed to examine transfer produced 
by other variations of S-R combina-
tions. These paradigms can be desig-
nated by the S-R terms in Task II 
learning (where the terms in Task I 
are always designated as A-B) as 
follows: The facilitation paradigms 
may be of the form A-B or A-B’ 
where A and B are identical in both 
tasks and where B’ may be an as- 
sociate or otherwise similar to the B, 
or response term, used in Task I. 
Among the interference paradigms 
are those labeled C-D (both the 
stimulus and response terms in Task 
II are different from those in Task I);
A-D (stimulus terms are the same in
the two tasks but responses are dif-
ferent); C-B (stimulus terms differ
in the two tasks and response terms
are the same); and A-Br (stimulus
and response terms are the same in
both tasks but are re-paired in Task
II).
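
The re-paired paradigm, in which the same stimulus and response terms are kept but matched differently in Task II, can be sketched as follows. The function name is hypothetical, and rotating the response terms is only one simple way to guarantee that no stimulus keeps its original response; it assumes the response terms are all distinct.

```python
def re_pair(pairs, shift=1):
    """Build a re-paired Task II list from Task I (stimulus, response) pairs.

    Every stimulus and every response from Task I reappears, but each
    stimulus is matched with a response that belonged to another stimulus.
    Assumes all response terms are distinct.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two pairs are needed for re-pairing")
    if shift % n == 0:
        raise ValueError("shift must not be a multiple of the list length")
    stimuli = [stimulus for stimulus, _ in pairs]
    responses = [response for _, response in pairs]
    return [(stimuli[i], responses[(i + shift) % n]) for i in range(n)]


if __name__ == "__main__":
    task_one = [("house", "young"), ("certain", "while")]
    print(re_pair(task_one))
```

Applied to the two word pairs quoted in the text, the sketch produces "house-while" and "certain-young".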
The foregoing paradigms have been 
described in detail by Underwood 
(1966), who also indicated that they
have been applied primarily in verbal
learning because of the analytical 
flexibility of such tasks as paired- 
associate learning. However, the ef- 
fect of these relationships on prob- 
lem solving has not been extensively 
investigated (Di Vesta & Walls, 
1967). As Schulz (1960) concludes, 


It is...possible that if some of the com- 
plex problems of everyday life could be 
analyzed into their S-R components, it 
would be found that the relationships [just 
described] would [also] be characteristic of 
the most difficult problems. Thus, suppose 
we have trained our students to make a 
particular set of responses to a particular set 
of stimuli. Later, the student finds himself 
confronted with a problem which contains 
this set of stimuli. However, for a variety
of reasons (e.g., errors in original training,
errors in original learning, new discoveries,
etc.) the S-R pairings required for solution
of this problem are different from those
required previously. How can we minimize
the amount of negative transfer this student 
is likely to encounter [p. 75]? 


This orientation, expanded further in 
the discussion section of this paper, is 
the basis of the present study. 

In summary, the purpose of the 
present experiment was to examine
the relative effects of the several S-
R relationships, described at a molar
behavioral level, between the Task I
and Task II phases of anagram solu-
tion. It was assumed that initial (Task
I) experiences in problem solving
would facilitate or hinder perform-
ance in new (Task II) problem-
solving situations in much the same
way as these effects are observed in
paired-associate learning. However,
the emphasis in the present study is
on generalized transfer, that is, on
transfer associated with manipula-
tions of discriminative stimuli and
solution rules, rather than on the
transfer of specific stimulus and re-
sponse elements, respectively. Thus, it
was hypothesized that when experi-
ences with the application of solution
rules in Task I are congruent with or
similar to the relationships required 
for effective solution of problems in 
Task II, the solution of anagrams in 
Task II will be facilitated. Con- 
versely, where the relationships be- 
tween the application of the solution 
rule in Task I and Task II are 
changed, the solution of new ana- 
grams will be hindered in Task II. 


METHOD 


Subjects 


The Ss were 75 undergraduate volunteers 
from the introductory educational psychol- 
ogy course at Pennsylvania State Univer- 
sity. There were 27 males and 48 females. 
The Ss were assigned, by reference to a 
table of random digits, to one of the five 
experimental groups. Randomization was 
restricted to the set of experimental condi- 
tions and recycled at N + 1 treatment as- 
signments. Although some Ss had partici- 
pated in verbal learning experiments, only 
one had experience with anagram prob- 
lems in experimental settings. The data for 
that S was eliminated from the analysis. 
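
The restricted randomization described above can be illustrated with a sketch. It reads the procedure as block randomization: each successive set of five Ss receives the five conditions in a fresh random order. That reading, the function name, and the use of Python's `random` module in place of a table of random digits are all assumptions made for illustration.

```python
import random
from collections import Counter

CONDITIONS = [
    "associative facilitation",
    "warm-up control",
    "classical interference",
    "associative interference",
    "re-paired control",
]


def assign_conditions(n_subjects, conditions=CONDITIONS, seed=None):
    """Assign Ss to conditions in successive randomly ordered blocks."""
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_subjects:
        block = list(conditions)
        rng.shuffle(block)           # new random order for each cycle
        assignments.extend(block)
    return assignments[:n_subjects]


if __name__ == "__main__":
    groups = assign_conditions(75, seed=1)
    print(Counter(groups))
```

With 75 Ss and five conditions, this scheme places exactly 15 Ss in each group.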


Design 

The overall design was patterned after 
that used by Jenkins, Foss, and Odom 
(1965). Each S participated in two tasks. 
Task II was the transfer task and was com- 
mon to all experimental conditions, hence, 
it shall be labeled A-B for convenience in 
presentation. In Task II S solved 16 eight- 
letter anagrams. Each anagram was pre- 
ceded by one of four discriminative stimuli, 
that is, by an asterisk, number symbol, am-
persand, or quotation mark. In Task I S
practiced solving eight-letter anagrams for
nonsense words by associating one of several
solutions with a given discriminative stim-
ulus until each solution could be used with-
out direct reference. As in Task II, a dis-
criminative stimulus always preceded the
eight-letter anagram.

Manipulations of the relationship be-
tween the discriminative stimulus and re-
sponse solution in the two tasks were 
accomplished in Task I. The following para- 
digms, labeled to parallel those employed 
by Jenkins et al. (1965), comprised the
experimental conditions: associative facilita-
tion (A-B’), warm-up (D-E), classical inter-
ference (A-E), associative interference (A-
B’r), and re-paired control (A-Br). The
overall design was thus represented by five 
interference-facilitation levels in which the 
assumed relationship between Task I and 
Task II is one of judged rather than of ob- 
served similarity or dissimilarity. 


Conditions 


Associative facilitation. In Task I (A-B’) 
this group learned four S (discriminative 
stimulus)-R (solution rule) pairs. The solu- 
tion rule was always in verbal form and the 
symbols were different from the number 
symbols used in Task II. Thus, the solutions 
in the two tasks were related but not iden- 
tical. Each pair was learned in conjunction 
with the actual solution of four different 
anagrams for nonsense words. An example 
was the stimulus 


-+-+-+-+ 
*EINOZPUH 


for which the printed rule was the verbal 
statement, “All the minus letters and then 
all the plus letters." The S's task was to 
learn to spell the nonsense word, “E-N-Z- 
U-I-O-P-H," before looking at the rule 
printed on a separate card. In Task II the 
stimuli were presented in the form 


12345678 
*DAINSCTE 


for which the solution was to be reported as 
the word “DISTANCE” and the rule ap-
peared in numerical form as follows: “*1357
2468.” The symbols or numbers above the
letters were there to facilitate the employ- 
ment of the rule. Other rules used in Task I 
were “The last pair of letters backward, 
then the first pair of letters forward, then 
the third pair of letters backward, and, 
finally, the second pair of letters forward" 
(in Task II the order given on the rule card
was 87 12 65 34); “Each pair of letters in 
order, but the members of each pair are re- 
versed” (in Task II the order on the rule 
card was 21 43 65 87); and, “The last pair 
forward, and the remainder in reverse or- 
der” (in Task II the order was 78 654321). 
Thus, in the transfer task, Ss in the associa- 
tive facilitation group were required to 
employ another but similar version of a
previously learned solution to a familiar 
situation. 
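
The two rule formats described above can be illustrated with a short sketch that reproduces the worked examples from the text. The function names are hypothetical. The leading asterisk is treated as the discriminative stimulus and is therefore not part of the letter string passed to the functions.

```python
def apply_numeric_rule(anagram, rule):
    """Read off the letters of `anagram` in the 1-based order given by `rule`.

    `rule` is a string of digits such as "1357 2468"; spaces are ignored.
    """
    order = [int(ch) for ch in rule if ch.isdigit()]
    if sorted(order) != list(range(1, len(anagram) + 1)):
        raise ValueError("rule must use each letter position exactly once")
    return "".join(anagram[position - 1] for position in order)


def apply_sign_rule(anagram, signs):
    """Verbal rule: all the minus letters and then all the plus letters."""
    if len(signs) != len(anagram):
        raise ValueError("one sign is needed for each letter")
    minus = [letter for letter, sign in zip(anagram, signs) if sign == "-"]
    plus = [letter for letter, sign in zip(anagram, signs) if sign == "+"]
    return "".join(minus + plus)


def make_anagram(word, rule):
    """Inverse of apply_numeric_rule: scramble `word` so that `rule` restores it."""
    order = [int(ch) for ch in rule if ch.isdigit()]
    if sorted(order) != list(range(1, len(word) + 1)):
        raise ValueError("rule must use each letter position exactly once")
    scrambled = [""] * len(word)
    for letter, position in zip(word, order):
        scrambled[position - 1] = letter
    return "".join(scrambled)


if __name__ == "__main__":
    print(apply_numeric_rule("DAINSCTE", "1357 2468"))
    print(apply_sign_rule("EINOZPUH", "-+-+-+-+"))
    print(make_anagram("DISTANCE", "1357 2468"))
```

The verbal rule and the numerical rule "1357 2468" pick out the same letter positions, which is the sense in which the Task I and Task II solutions were related but not identical.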

Warm-up control. This group first learned 
a set of S-R pairs (D-E) that was unrelated 
to those in the transfer task (A-B). A ques-
tion mark, a slash, a colon, and dollar sign
were used as discriminative stimuli and the
solutions were learned as formulas com-
prised of number symbols (e.g., 876 12345).
All stimuli and response-solutions in Task
I were different from those in Task II. Thus, 
the transfer task for the warm-up control 
group required S to learn new solutions to 
new problem-solving situations. 

Classical interference. The Task I list 
(A-E) learned by this group consisted of 
the same discriminative stimuli used in the 
solution of anagrams in Task II but they 
were paired with the formulas used in Task 
I by the warm-up control. Thus, the dis- 
criminative stimuli in both Task I and Task 
II for this group were identical but the re-
sponse-solutions were completely different.
In the transfer task S learned new solutions 
to a familiar problem-solving situation. 

Associative interference. Task I S-R pairs
(A-B’r) learned by this group consisted of
the discriminative stimuli and verbal rules 
used in the associative facilitation condi- 
tion except that the rules were randomly re- 
paired with other stimuli. Thus, in Task II 
this group had to make new associations be- 
tween learned solutions and familiar prob- 
lem-solving situations. In addition, they 
were required to translate a verbal rule to
a numerical rule to arrive at the solution. 

Re-paired control. In Task I this group 
learned to pair the same discriminative stim- 
uli and formulas required to solve the ana- 
grams in Task II except that the pairing 
was rearranged. The experimental arrange- 
ment for this group (A-Br) is described 
further in the introduction. Unlike the as-
sociative interference group, it was unneces-
sary for Ss to proceed through the transla-
tion step since the response-solutions were
identically the same in both Task I and 
Task II. 


Procedure 


Each anagram was typed in capital let-
ters in the center of 3 × 5 inch plain index
cards with two spaces between each pair of
letters. The cards were spiral-bound in book-
lets. The cards with solution rules were
bound in separate booklets.

In Task I the booklets were placed in 
front of S on a table at which S and the 
experimenter were seated opposite each 
other. The S was instructed according to
instructions adapted from Ronning
(1965). The overall procedure was, other- 
wise, similar to the anticipation method in 
paired-associate learning except that the 
trials were self-paced by S. 




Briefly, S was presented an eight-letter 
anagram and was asked to spell the required 
nonsense word. In the first block of trials 
for Task I S was shown an anagram simul-
taneously with the corresponding rule, in verbal form for
the Facilitation and Associative Interfer- 
ence groups but in numerical form for the 
Warm-up, Classical Interference, and Re- 
paired Control groups. The rule indicated 
how the letters of the anagram were to be 
reordered. The S's task was to call off the 
letters in the required order. He was given 
practice on a sample card until the ana- 
gram could be solved without the use of 
the rule card. Following the familiarization 
period he proceeded with the Task I train- 
ing-test series. As he went through the 
training (Task I) deck he was instructed 
that he should try to recall the rule asso-
ciated with the symbol (i.e., the discrimi-
native stimulus). After eight training trials,
two anagram problems for each of four dif- 
ferent S-R combinations, S was tested on 
four new nonsense-word problems without
knowledge of results, as a criterion for Task 
I. If he was unable to read off the four 
test words without error, the training series 
was again repeated. He was then tested on 
another, different, set of four test problems.
The procedure was repeated until S reached 
the criterion of four perfect solutions when 
unscrambling the anagrams for the nonsense 
words in Task I. 

A list composed of 16 eight-letter Eng-
lish nouns had been chosen for Task II
from the Thorndike-Lorge (1944) word
book. The procedure was to select a num-
ber at random between 1 and 208, the num-
ber of suitable pages in the Thorndike-Lorge
norms. Beginning at the top of that page,
that, and following pages if necessary, were
perused until an eight-letter noun was
found. When a word was found, the pro-
cedure was repeated. The nouns selected in
this manner and used for anagrams in Task
II were the following: distance, pleasure,
anything, minister, evidence, stranger, ad-
dition, thousand, boundary, campaign, strat-
egy, shoulder, business, position, division,
language.

Task II was begun immediately following 
Task I with no break except to instruct S 
that he should now say the “eight-letter 
English word” to be deciphered from each 
of the new anagrams. The S was permitted
to use pencil and paper if he thought it
would help him. However, he was not
permitted to write the rules. Time to solu-
tion was recorded by the experimenter with
a maximum of 120 seconds allowed for each
anagram. If S had not reached solution
within the allotted time, the rule was ex- 
posed for him to use. As in Task I, the ap- 
propriate rule was exposed for 5 seconds fol- 
lowing a correct solution. The times to solu- 
tion for the 16 nouns were blocked into 
four trials of four words each to facilitate 
the analysis of the data.


RESULTS 


The mean number of trials to cri- 
terion for all groups on the initial 
task (Task I) was 3.5. The range of 
means for the treatment groups was 
3.0-4.3. An analysis of variance of
the data for the five groups indicated 
no significant differences among the 
means (F = 1.99, df = 4/70, p > 
.05). 

The main transfer results for the treatment groups’ performances
on Task II are displayed in Figure 1. In that figure are
summarized the mean number of seconds taken by each group to solve
each of the four blocks of four anagrams.

Fig. 1. Mean number of seconds taken by the five groups to solve
the anagrams in Task II. (Horizontal axis: blocks of four trials.)

A mixed analysis of variance with 
five levels of the treatment variable 
(between) and four levels of trials 
(within) was performed on these 
data. The effect due to experimental 
treatments was significant (F = 
21.16, df = 4/70, p < .01). In addi- 
tion, the analysis yielded significant 
effects due to blocks of trials (F = 
80.39, df = 3/210, p < .01), and to 
the interaction between treatments 
and blocks of trials (F = 3.05, df 
= 12/210, p < .01). 
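The degrees of freedom reported for the mixed analysis follow directly from the design. The sketch below reproduces that bookkeeping; the figure of 15 Ss per group is an assumption inferred from the reported error term of 70 with five groups, and the function name is hypothetical.

```python
def mixed_anova_df(n_groups, n_per_group, n_blocks):
    """Degrees of freedom for a groups (between) x blocks (within) design."""
    n_subjects = n_groups * n_per_group
    df_treatments = n_groups - 1
    df_error_between = n_subjects - n_groups
    df_blocks = n_blocks - 1
    return {
        "treatments": (df_treatments, df_error_between),
        "blocks": (df_blocks, df_error_between * df_blocks),
        "interaction": (df_treatments * df_blocks, df_error_between * df_blocks),
    }
```

Calling `mixed_anova_df(5, 15, 4)` gives 4/70 for treatments, 3/210 for blocks, and 12/210 for the interaction, in agreement with the values reported above.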

An analysis of differences in trends 
for the significant interaction term 
was made by the method of linear 
contrasts in which Scheffé’s test was 
used. The interaction was found to be 
due solely to the differences between 
the slope of the curve for the Facilita- 
tion group and the slope of the curves 
for each of the other groups. All of 
these differences were significant (p < 
.05). None of the other comparisons
was significant (p > .05). It can be 
noted in Figure 1 that the curves for 
all but the Facilitation group are es- 
sentially parallel. This observation was 
confirmed by the results of the anal- 
ysis. 

TABLE 1
DIFFERENCES BETWEEN GROUPS IN MEAN NUMBER OF SECONDS TAKEN TO
SOLVE ANAGRAMS ON EACH BLOCK OF TRIALS

                         | Re-paired | Associative  | Classical    | Warm-up
                         | control   | interference | interference |
Associative facilitation | 21.10*    | 21.70*       | 43.46*       | 45.02*
Re-paired control        |           | 0.60         | 22.53*       | 23.92*
Associative interference |           |              | 2 .82* [partly illegible]
Classical interference   |           |              |              | 1.39

*p < .01, Newman-Keuls test.

The Newman-Keuls procedure for making a posteriori tests was used to
test the differences between experi- 
mental group means across all of the 
groups. The results of these analyses 
are presented in Table 1. There it 
may be seen that the performance of 
the Re-paired Control group did not 
differ significantly (p > .05) from 
that of the Associative Interference 
group. Neither did the performance of 
the Classical Interference group dif- 
fer significantly (p > .05) from 
that of the Warm-up group. All other 
comparisons among treatments were 
significant (p < .01). 


Discussion 


The evidence clearly supports the 
hypothesis that Task I experience 
with application of a solution rule 
transfers to benefit or hinder per- 
formance in a later problem-solving 
task. An interpretive analysis of these 
data permits the identification of 
three levels of interference-facilita- 
tion associated with transfer of solu- 
tion rules in problem solving. 

At the first level it is to be ob- 
served that the Facilitation group 
solved the anagrams in Task II more 
rapidly than did the remaining 
groups. In the first phase they had 
learned one form of the solution 
rules (response integration) and the 
stimuli to which these rules applied 
(hook-up or association). Only the 
translation to new forms of the rules 
had to be learned in the second task. It is perhaps all too
obvious that associations of considerable strength must be assumed
to have existed preexperimentally between the verbal rule used in
Task I and the number rule used in Task II by this group. However,
to argue that variables other than mediational [illegible] were
the main factors influencing the more rapid learning by this group
compared to the other groups would require an explanation in terms of
general transfer principles. The var- 
iables that would necessarily account 
for transfer in such an analysis were 
controlled in other parts of the de- 
sign. (See Jenkins et al., 1965, for
an earlier description on which the 
present explanation is based.) It 
should be noted that the Re-paired 
Control group had even more direct 
experience than did the Facilitation 
group with the specific stimuli and 
solution rules used in Task II, the 
exception being that the stimuli and 
solution rules were arranged differ- 
ently in the two tasks. Accordingly, 
facilitation cannot be explained only 
in terms of learning experiences in the 
current experimental situation. Neither 
can the facilitation be attributed to a 
lack of equated interference possi- 
bilities or to nonspecific transfer. These 
factors have been controlled in the 
Warm-up and Classical Interference 
groups, both of which took longer to 
solve the anagrams in Task II than did 
the Facilitation group. 

The Re-paired Control and the 
Associative Interference groups are 
at the second level of performance. 
These two groups had also achieved response integration through
experience in the first task with the identical or similar forms
of the rules to be used in the second task. However, the
interchange of the pairing of these solution rules with
discriminative stimuli imposed an additional learning requirement.
Unlike that of the Facilitation group, the performance of these
two groups was impaired by the requirement that new applications
(or associations) had to be made. Although some interference
resulted from these conditions, it does not appear to be as great,
relatively speaking, as that observed in paired-associate learning
when similar comparisons are made of the Facilitation paradigm
with either the Re-paired Control or Associative Interference
paradigms.

Finally, at the third level of per- 
formance are to be found the Warm- 
up and Classical Interference groups. 
In these groups both the discrimina- 
tive stimuli and solution rules, or 
only the solution rules, respectively, 
were changed from Task I to the 
transfer task. Thus, neither the solution
rules nor their applications, as
learned in the first task, were appli- 
cable to solving the anagrams in the 
second task. It is this paradigm that 
has been employed most extensively 
in the water-jar and similar problem- 
solving tasks. The transfer task in 
these two conditions makes greater 
demands on S than do the other con- 
ditions. The S must integrate the 
new solution rule, link it with a new 
stimulus, and, possibly, learn to em- 
ploy it to arrive at a correct solution. 
These last two groups may have ben- 
efited from experience in the first 
task as a result of nonspecific trans- 
fer (learning-to-learn) or they may 
have been hindered by interference 
from rules learned in the first task. 
However, the effects of these two 
factors cannot be evaluated in the 
present experiment due to the lack of 
an absolute control group. 

The three levels described above 
differ in the degree of complexity of 
the learning process required in dif- 
ferent stages of problem solving. Each 
level makes increasing demands on 
S, but the demands at one level are 
present at the next. This interpreta- 
tive possibility is suggested also in 
the performance curves where it ap- 
pears that the three major groupings 
begin at different points along a theo- 
retical overall learning curve. 

In general, the effects reported here on problem solving roughly
parallel those found with verbal learning tasks. Differences in
results between the two learning situations are to be observed and
to be expected because
of differences in the processes required 
in the two kinds of tasks. In addition, 
the values of the variables in the pres- 
ent experiment could only be relative 
and thus differ markedly in value 
from those typically employed in 
paired-associate learning (Mandler & 
Heinemann, 1956).


TWO METHODS FOR ADAPTING SELF-INSTRUCTIONAL
MATERIALS TO INDIVIDUAL DIFFERENCES¹


RALPH J. MELARAGNO 
System Development Corporation, Santa Monica, California 


2 methods for adapting self-instruction to individual differences
(Branching and Prediction) were compared with each other and
with a nonadaptive method (Linear). Branching used learner
performance on the task for modifying the sequencing of instruction;
Prediction used scores on 5 pretests for varying the sequence.
Branching and Prediction procedures 1st were determined empirically,
then the 3 methods were experimentally assessed in a computer-based
laboratory. Bivariate analysis of training time and posttest score
showed a significant difference between Branching and Linear groups.
Univariate analysis of posttest score yielded no significant
differences; univariate analysis of training time showed Branching
superior to Prediction, and Prediction superior to Linear. Results
support the use of procedures for adapting instructional materials
to individual differences.


A continuing problem in the appli- 
cation of human learning research to 
instructional procedures is that of 
adapting to individual differences 
among learners. While it is recog- 
nized in the fields of education and 
training that individual differences in 
the learning process exist, little has 
been done about individuality be- 
cause of the dearth of factual infor- 
mation derived from experimentation 
(Eckstrand, 1962). 

One approach to human learning, 
programmed instruction (PI), has
paid at least nominal attention to 
individual differences. Early forms of 
PI provided for individuality only in 
the learners’ rate of progress through 
the instruction; a basic assumption 
was that the program was appropriate 
for all learners, and the only variation in performance would be
in terms of the rate variable. This form of PI has become known as
"linear"; that is, all learners progress through a common program
in a straight-line fashion.

¹This paper is based on a doctoral dissertation presented to the
Graduate School, University of Southern California. The research
was conducted under the auspices of Contract N00014-66-C0081
between the Office of Naval Research, Washington, D. C., and
System Development Corporation, Santa Monica, California.
Reproduction in whole or in part is permitted for any

A more recent form of PI has 
evolved through the introduction of 
the digital computer into the physi- 
cal makeup of the teaching machine. 
Computer-based instruction uses a 
form of PI called “branching,” per- 
mitting learners to branch at frequent 
times during training to the instruc- 
tion supposed to be most appropriate 
to their individual needs. The basic 
branching system rests upon the com- 
puter’s capability to keep track of 
learner performance data and to se- 
lect the next sequence of instruction 
on the basis of prior performance. 
Thus, branching PI can accommodate 
individuality in amount of instruc- 
tion and kind of instruction, as well 
as in rate of progress. 

Recently, there have been proposals 
to use the preinstructional character- 
istics of the learner as criteria for the 
selection of appropriate types of in- 
struction. Saltzman (1964) and Sto- 
lurow and Davis (1965) have indi- 
cated that entry behavior of the
learner be input to the machine and
used for specifying the instruction 
appropriate for each learner. 

Two distinct procedures, then, have 
been proposed for the modification of 
instruction as a function of the in- 
dividual differences among learners. 
In one case, adaptation of a basic
sequence is made on the basis of 
learner performance on prior instruc- 
tion; in the second, the instructional 
sequence for any learner is determined 
by his unique preinstructional behav- 
ior. The present study was addressed 
to two questions: (a) Which of these 
procedures is more effective in adapt- 
ing PI to individual learner dif- 
ferences? (b) Is either of them more 
effective than linear PI? 


METHOD


Three procedures for presenting self-in- 
structional materials were used in the study. 
The first, Branching, used data from learner 
performance on the instructional materials 
as a basis for adjusting subsequent instruc- 
tion to individual differences. The second, 
Prediction, used data from learners’ entry 
behavior for predicting the best sequences 
of instructional materials for each learner 
individually. The third, Linear, presented 
an identical instructional sequence to all 
learners. 

The study was conducted in two phases. 
During the first phase, empirical trials of 
a self-instructional program in geometric 
inequalities were conducted with small 
groups of learners in order to determine 
strategies to be employed for the Branching 
and Prediction conditions. In the second 
phase, the relative effectiveness of the
Branching, Prediction, and Linear conditions 
was experimentally investigated in a com- 
puter-based instructional laboratory. 


Phase I 


Subjects. Thirty-two volunteer subjects
(Ss) were obtained from two high schools 
and were paid for their participation. The Ss 
were recruited from first-semester geometry 
classes, since students in these classes were 
naive concerning the content of the program 
yet possessed the mathematical prerequisites.

Materials and apparatus. The program in
geometric inequalities was prepared for and 
used in earlier investigations (Coulson, 1964; 
Coulson, Melaragno, & Silberman, 1965). It 
was made up of six units: The first five 
treated axioms, theorems, and postulates of 
inequalities and the sixth contained in- 
struction on methods for preparing to prove 
theorems. At the end of each unit, Ss were
tested briefly on the material in that unit, 
and then given an additional set of materials 
devoted to discussions of questions in the 
quiz. 

The program was written in multiple- 
choice format. Each step in the program was 
presented on a separate page. Many of these 
steps called for no response but were only 
to be read. For the Phase I trials, the cor- 
rect answer to each question was written on 
the back of the page containing the ques- 
tion. 

Measuring instruments. The posttest was
composed of 33 questions and, because of 
multiple answers, had a total possible score 
of 57; 29 points were possible for tasks di- 
rectly related to the content of the program, 
and 28 points were possible for points related 
to transfer situations. 

In order to obtain a wide sampling of 
relevant entry behavior, seven tests were 
selected as potential pretests. One test 
measured achievement in fundamentals of 
geometry; the other six were found by using 
Guilford’s Structure of Intellect model 
(Guilford & Merrifield, 1960) as a means 
for determining intellectual abilities that 
should bear a relationship to the learning 
of geometry. The six Guilford tests used 
were: Problem Solving, Symbolic Grouping, 
Match Problems, Operations Sequence, Hid- 
den Figures, and Symbolic Reasoning. 

Method. The 32 Ss were first administered 
the seven pretests, then given the inequal- 
ities program. The first 12 Ss received the 
program individually, the next 10 in pairs, 
and the final 10 in two groups of five Ss 
each. The experimenter observed each S's 
progress through the program, responded to 
questions from Ss, and kept detailed records 
of kinds and specific locations of unusual 
performances to determine indications of the need for branching.
Two distinct types of branching were determined: locations in the
program where Ss could afford to skip past redundant instruction
and locations where performance indicated the necessity for
additional remedial instruction.

The branching structure that evolved from these empirical trials
had the following characteristics: nine points in the program
where satisfactory performance allowed S to branch past redundant
instruction and 14 locations where poor performance on quiz
questions caused S to be branched to remedial instruction. The
final version of the program contained 248 items in the main
portion, 20 items devoted to end-of-unit quizzes and feedback to
the quizzes, and 38 remedial items.

Results. After the empirical trials, the
seven pretests were scored and related to Ss’ 
performance on the program. The technique 
of multiple cutoff scores was found to be 
most effective in determining which Ss 
should branch at each of the 23 decision 
locations in the program. Two pretests 
showing no relationship to performance were 
eliminated: the Symbolic Reasoning and 
Match Problems tests. 
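The technique of multiple cutoff scores can be illustrated as follows. The test names come from the study, but the cutoff values, the `meets_cutoffs` helper, and the direction of the comparison (a score at or above every cutoff) are hypothetical; the article does not report the actual cutoffs used at the 23 decision locations.

```python
def meets_cutoffs(scores, cutoffs):
    """Return True when S reaches every cutoff set for a decision location.

    scores:  dict mapping pretest name to S's score
    cutoffs: dict mapping pretest name to the minimum score required
             (only the tests relevant to this location need appear)
    """
    return all(scores[test] >= minimum for test, minimum in cutoffs.items())


# Hypothetical cutoffs for one decision location.
example_cutoffs = {"Geometry Achievement": 20, "Hidden Figures": 12}
example_scores = {
    "Geometry Achievement": 24,
    "Problem Solving": 9,
    "Symbolic Grouping": 15,
    "Operations Sequence": 11,
    "Hidden Figures": 10,
}
```

In this example S falls below the hypothetical Hidden Figures cutoff, so `meets_cutoffs(example_scores, example_cutoffs)` is false. Unlike a weighted composite, a multiple-cutoff rule does not let a high score on one test compensate for a low score on another.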


Phase II 


Subjects. To investigate experimentally 
the two questions of the study, 44 Ss were 
recruited from a different high school from 
those that supplied Ss for Phase I. The Ss 
were obtained from seven classes in the 
second semester of geometry, and were paid 
for their participation. In order to assure
naiveté with respect to geometric inequal- 
ities, teachers of the seven classes postponed 
any treatment of inequalities until after 
the experiment was completed. 

Treatment conditions. The Ss were ran- 
domly assigned to three treatment condi- 
tions: Linear (N = 14); Prediction (N = 
15); and Branching (N = 15). The design 
of the experiment was a one-way, random 
effects, multiple analysis of variance, with
the three conditions as independent variables,
and with scores on the posttest and
training times as dependent variables. 

Materials and apparatus. The program in geometric inequalities
was used as the experimental vehicle. It was contained in
looseleaf notebooks, each item on a separate page. Following each
of the six units in the program, Ss received the quiz on the
topics included in the unit and evaluated the adequacy of their
answers.

The experiment was conducted in CLASS, a 20-station
computer-based instructional laboratory at System Development
Corporation in Santa Monica, California. Each station was provided
with a notebook containing the program, a device for the S's
responses to questions, scratch paper, and a pencil. To control
for novelty of the environment, all three treatments were
administered in CLASS.

Method. Prior to the study, Ss were
administered the pretests in group settings. 
The Ss were brought to the laboratory daily 
by chartered bus at the end of normal school 
days for experimental runs of 50-75 minutes. 

The Linear condition was run first, then 
the Branching, and finally the Prediction 
condition. The first time a particular treat- 
ment group came to the laboratory, Ss were 
read a standard set of directions explain- 
ing the operation of the equipment and pro- 
cedures to be followed. Each S progressed 
at his own rate. The experimenter answered 
questions about the equipment, but not 
about the content of the program. When an 
S completed the program he remained at 
his station, but was not allowed to look over 
the program. 

The posttest was administered to all Ss 
in each treatment condition, as a group. 
It was given the day following the comple- 
tion of the program by all Ss in that group. 

For the Linear condition, all 14 Ss re- 
ceived the same 285 items from the pro- 
gram. This included 248 main instruction 
items, 20 items of quizzes and feedback on 
quizzes, and 17 remedial instruction items.

The five pretests for each of the 15 Ss 
in the Prediction condition were prestored 
in the computer. At each of 23 decision 
points in the inequalities program, the com- 
puter examined these scores and directed 
each S to the next item that had been pre- 
dicted as appropriate for him. 

At each decision point in the program for 
the Branching condition, the computer ex- 
amined certain prior performances of each 
of the 15 Ss in that group and determined 
whether or not to branch the S. Most 
branching decisions were made as a function 
of Ss' evaluations of their performance on 
end-of-unit quizzes. (Evaluations of quiz 
performances were ignored in the Linear and 
Prediction conditions.) 
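The decision made at each point under the three conditions can be sketched roughly as below. The `DecisionPoint` structure, the item numbers, and the function and argument names are hypothetical illustrations; the sketch does not reproduce the CLASS software, only the logic described in the text (a fixed path for Linear, pretest-based prediction for Prediction, and quiz performance for Branching).

```python
from dataclasses import dataclass


@dataclass
class DecisionPoint:
    kind: str          # "skip" (past redundant items) or "remedial"
    default_next: int  # item that follows when no branch is taken
    branch_next: int   # item that follows when the branch is taken


def next_item(condition, point, performed_well=False, predicted_branch=False):
    """Choose the next item for one S at one decision point."""
    if condition == "linear":
        # Every S follows the same fixed sequence.
        return point.default_next
    if condition == "prediction":
        # The branch was decided in advance from the prestored pretest scores.
        take_branch = predicted_branch
    elif condition == "branching":
        # The branch depends on S's evaluated performance on the program.
        if point.kind == "skip":
            take_branch = performed_well
        else:
            take_branch = not performed_well
    else:
        raise ValueError(f"unknown condition: {condition}")
    return point.branch_next if take_branch else point.default_next
```

The only difference between the two adaptive conditions in this sketch is the source of the branching signal: `predicted_branch` is fixed before instruction begins, whereas `performed_well` changes as S works through the program.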


TABLE 1
POSTTEST SCORES AND TRAINING TIMES

                            | Linear | Prediction | Branching
Posttest scores
    M                       | 37.4   | 37.3       | 41.1
    SD                      | 9.5    | 6.5        | 5.8
Training times (in minutes)
    M                       | 195.1  | 168.6      | 147.1
    SD                      | 27.8   | 28.7       | 38.6


TABLE 2
VALUES OF F FOR COMPARISONS OF PAIRS OF BIVARIATE CENTROIDS

Comparison               | F      | df    | p
Linear vs. Prediction    | 3.057  | 2, 26 | .10 > p > .05
Linear vs. Branching     | 10.593 | 2, 26 | p < .001
Prediction vs. Branching | 3.263  | 2, 27 | .10 > p > .05


Results. The means and standard devia- 
tions for both dependent variables, by 
groups, are presented in Table 1. Box’s test 
of homogeneity of bivariate dispersions 
(Cooley & Lohnes, 1962) yielded an F of 
.622, which was not significant. The multivariate analysis of
variance was computed using the technique presented by Rao
(1952); this yielded an F of 11.571 which,
for 4 and 82 degrees of freedom, is significant 
beyond the .0005 level of confidence. 

Since the hypothesis of homogeneity of 
dispersions could not be rejected and since 
the hypothesis of random sampling from a 
common population was rejected (by the 
multivariate analysis of variance), tests of 
the experimental questions were performed. 
The experimental questions were tested using Hotelling’s T²-test
for bivariate centroids, which is an extension of Student's
t-ratio for univariate samples (Rao, 1952). T² is evaluated for
significance by transforming it to F; the results are presented
in Table 2. 
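The transformation of T² to F mentioned above can be written out as a short function. This sketch uses the standard two-sample relation F = T²(n₁ + n₂ − p − 1) / [p(n₁ + n₂ − 2)], with p and n₁ + n₂ − p − 1 degrees of freedom; the function name is illustrative, and the T² value in the usage note is invented, since the article reports only the resulting F values.

```python
def t2_to_f(t2, n1, n2, p=2):
    """Convert a two-sample Hotelling T-squared value to F.

    p is the number of dependent variables (2 for the bivariate case here).
    Returns (F, numerator df, denominator df).
    """
    df1 = p
    df2 = n1 + n2 - p - 1
    f = t2 * df2 / (p * (n1 + n2 - 2))
    return f, df1, df2
```

With groups of 14 and 15 Ss the function gives 2 and 26 degrees of freedom, and with two groups of 15 Ss it gives 2 and 27, matching the df column of Table 2. An invented T² of 27.0 for the 14-and-15 comparison would convert to F = 13.0.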

In order to gain further information 
about the effects of the treatment conditions, 
the two dependent variables were also sub- 
jected to univariate analyses. For posttest 
scores, F = 1.301; df = 2/41; and p > .10.
For training times, F = 10.341; df = 2/41;
and p < .005. Since a significant difference 
was found among the three training times, 
t-tests were applied to evaluate differences 
between pairs of treatment means. The results
are shown in Table 3.


TABLE 3 


VALUES OF t FOR COMPARISONS OF MEAN TRAINING TIMES


Discussion


Inspection of the means in Table 1 
shows a relatively clear ordering of 
the three treatment conditions. The
Linear condition is the least effec- 
tive; the Branching condition the 
most effective; and the Prediction 
condition falls in between. The sig- 
nificant difference in bivariate cen- 
troids between the Branching and 
Linear conditions implies that an 
adaptation procedure based upon 
S performance on the learning task is 
a useful strategy to follow. However,
since the Prediction condition does 
not differ significantly from either 
the Branching or Linear conditions, 
it is not clear whether adaptation to 
individual differences is necessary 
and, if it is, what approach is to be 
preferred. 

Some clarification of this situation 
is afforded by the results of the anal- 
ysis of training times. It appears that 
when a self-instructional program 
produces essentially equivalent learn- 
ing among Ss regardless of individual 
differences, significant savings in 
training time can be achieved by at- 
tending to such individuality in learn- 
ing. In addition, it seems that when 
individual differences are to be con- 
sidered, a method using performance 
on the learning task (branching) is 
preferable to a method based on pre- 
training assessment (prediction).

These conclusions, and their impli- 
cations for adapting to individual dif- 
ferences, must be evaluated in light 
of the restrictions in this study. Ge- 
ometry is a unique subject matter be- 
cause of the interrelationships among 
topics and the orderly development 
of instruction. Furthermore, geometry 
students are a restricted sample of a high school population:
Geometry is an elective subject, chosen by students who have had
previous success in mathematics. It may be that the
conclusions presented are limited to 
highly organized subject-matter areas 
and to learners who have achieved 
success in prior, related areas.


TEACHER VERBAL CUES AND PUPIL PERFORMANCE
ON A GROUP READING TEST¹


GEORGE S. LAMB
Western Washington State College 


Verbal cues encouraging (a) rapid work, (b) accurate work, or (c) no
specified cues were administered by 36 female teachers to their 2nd-
and 3rd-grade classes during a group reading test. Effects of the cues
were evaluated in terms of the class performances on the tests. Data
were treated by analysis of variance and Scheffé confidence intervals
at the .01 and .05 levels of significance respectively. Independent
sources of variation were (a) treatments, (b) grade levels, (c) sex,
and (d) reading achievement levels. Results indicated that girls were
more responsive to teachers’ verbal cues than boys.


What effects do teachers’ verbal cues of encouragement or warning
have upon the behavior of pupils? Teachers constantly offer a wide
variety of verbal cues in their classrooms, which are intended to
modify
the behavior of pupils. The assump- 
tion that these cues really do affect 
pupil behavior is generally taken for 
granted. Yet this assumption has 
rarely been tested experimentally un- 
der ordinary classroom conditions. 

Most of the experiments of verbal 
cues and reinforcements have been 
laboratory studies. They have gen- 
erally involved an experimenter work- 
ing with one subject at a time and 
have measured performance on simple 
physical tasks (Stevenson, 1965). The 
results of these studies have limited 
application to group instruction in 
the classroom (Jackson, 1965). 

Classroom experiments in this area have investigated
reinforcements, either oral (e. g., Auble & Mech, 1953; Hurlock,
1925; Van De Riet, 1964) or written (e. g., Anderson, White, &
Wash, 1966; Forlano & Axelrod, 1937; Wallen, 1964). With the
exception of the study by Page (1958) of teachers’ written
comments, these studies have dealt with pupils independent of one
another instead of pupils as integral parts of a classroom with
its own group interactions.

¹This study is based on a dissertation submitted to the Graduate
School of the University of Minnesota in partial fulfillment of
the requirements for a doctoral degree. The author is indebted to
Theodore Clymer for his valuable assistance and advice in
conducting this study.

The present experiment was an in- 
vestigation of verbal cues adminis- 
tered during the time that pupils were 
working on the tasks, rather than of 
verbal reinforcements administered at 
the conclusion of the tasks. Specifi- 
cally, the study sought to answer the 
following question: Are the responses 
of second- and third-grade pupils on 
two subtests of the New Develop- 
mental Reading Tests (Bond, Ba- 
low, & Hoyt, 1963a) affected by the 
teacher’s verbal cues to work rapidly 
or to work accurately? The treat- 
ment cues were administered to class- 
room groups rather than to individual 
pupils.

One widely accepted principle of learning is that cues which
arouse motivation toward the achievement of an educational
objective will increase the effectiveness with which that
objective is achieved (Wallen & Travers, 1963). If the treatment
cues have a straightforward effect, we would expect that classes
receiving speed cues would attempt more task items than classes
receiving accuracy cues or no cues. We would also expect that
classes receiving accuracy cues would achieve a higher percentage
of items correct than classes receiving speed cues or no cues.


METHOD


Subjects 


The sample was 18 second-grade classes 
and 18 third-grade classes to which female 
student teachers had been previously as- 
signed. The classes were located in central 
area and suburban schools of metropolitan 
Minneapolis and St. Paul. 


Variables 


The tasks were two subtests of the New 
Developmental Reading Tests, a group 
reading achievement test for second- and 
third-grade pupils. In the first subtest the 
pupils are to mark the one word among four 
which identifies a picture. In the second 
subtest the pupils are to mark a set of pic- 
tures according to specific written direc- 
tions. Pupils attempt as many items as they 
can within the time limits of the test. For 
this study, it was important that the distributions of scores
should not be compressed because pupils ran out of task items
during the testing. The time limits for the 
subtests were, therefore, reduced from 10 to 
4 minutes and from 15 to 9 minutes, respec- 
tively. The test yielded measures of seven 
dependent variables: (a) the number of 
items correct on Subtest I, (b) the number 
of items correct on Subtest II, (c) the adjusted score for
guessing on Subtest I,² (d) the number of items attempted on
Subtest I, (e) the number of items attempted on Subtest II, (f)
the ratio of the number of
items correct to the number of items at- 
tempted on Subtest I, and (g) the ratio of 
the number of items correct to the number 
of items attempted on Subtest II. 
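The seven dependent variables can be computed from four counts per pupil, as the following sketch shows. Two assumptions are made and should be checked against the test manual: every attempted item is scored either correct or wrong, and the guessing adjustment subtracts the number wrong divided by one less than the number of choices (four choices on Subtest I). The function and key names are hypothetical.

```python
def dependent_variables(correct_1, attempted_1, correct_2, attempted_2,
                        choices_1=4):
    """Return the seven measures (a)-(g) for one pupil."""
    wrong_1 = attempted_1 - correct_1  # assumes no omissions among attempts
    return {
        "a_correct_I": correct_1,
        "b_correct_II": correct_2,
        # Assumed correction for guessing: rights minus wrongs / (k - 1).
        "c_adjusted_I": correct_1 - wrong_1 / (choices_1 - 1),
        "d_attempted_I": attempted_1,
        "e_attempted_II": attempted_2,
        "f_ratio_I": correct_1 / attempted_1 if attempted_1 else 0.0,
        "g_ratio_II": correct_2 / attempted_2 if attempted_2 else 0.0,
    }
```

For a pupil with 18 of 24 attempted items correct on Subtest I, the adjusted score under these assumptions is 18 − 6/3 = 16, and the ratio of correct to attempted items is .75. The attempted counts bear on the speed cues, and the ratios bear on the accuracy cues.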


Procedures 


The classes were randomly assigned to one of three treatments:
(a) speed, (b) accuracy, and (c) control. The female student
teachers were the experimenters and received prior training on the
administration of the tests and the treatments. The treatments
were administered while the classes were taking the tests. For the
speed treatment, the experimenters gave verbal cues encouraging
rapid work. For the accuracy treatment, they gave verbal cues
encouraging accurate work. For the control treatment, no verbal
cues were given. The content and timing of the cues were specified
by a script (see Figures 1 and 2). The timing of the tests and the
cues was by stop watch.

²The adjusted score is the number correct minus 1/3 the number
wrong, and is the standard method of scoring the first subtest.


Subtest I. Word Recognition

Time     Cue
0 min.   Begin. Let’s see how quickly you can work today.
1 min.   You people are doing very well. I see that you are
         getting a lot done.
2 min.   My, you people are fast workers.
3 min.   It looks to me as if many of you will get most of your
         work finished.
4 min.   Stop.

Subtest II. Comprehending Specific Instructions

Time     Cue
0 min.   Begin. Let's see how quickly you can work today.
1 min.   You people are doing very well. I see that you are
         getting a lot done.
2 min.   My, you people are fast workers.
3 min.   It looks to me as if many of you will get most of your
         work finished.
5 min.   I am pleased to see how quickly everyone is doing his work.
7 min.   I am happy to see that you are getting so much done.
9 min.   Stop.

Fig. 1. Schedule of speed cues.

Analyses 


Since it was possible that the treatments 
would have different effects in the two 
grade levels or on boys and girls, the 
analysis compared the group means in these 
categories and their combinations as well 
as comparing the general treatment effects. 
Another possibility was that the treatments 
would have different effects upon groups of 
pupils of differing reading ability. In order 
to examine this latter effect, the third sub- 
test of the New Developmental Reading 
Subtest I. Word Recognition

Time     Cue
0 min.   Begin. Let's see how carefully
         you can work today.
1 min.   You people are doing very well.
         I see many perfect papers.
2 min.   My, you people are careful work-
         ers.
3 min.   It looks to me as if many of you
         will get most of your work
         right.
4 min.   Stop.


Subtest II. Comprehending Specific
Instructions

Time     Cue
0 min.   Begin. Let's see how carefully
         you can work today.
1 min.   You people are doing very well.
         I see many perfect papers.
2 min.   My, you people are careful work-
         ers.
3 min.   It looks to me as if many of you
         will get most of your work
         right.
5 min.   I am pleased to see how carefully
         everyone is doing his work.
7 min.   I am happy to see that you are
         doing this work correctly.
9 min.   Stop.


Fig. 2. Schedule of accuracy cues.


Tests (Bond, Balow, & Hoyt, 1963b), was 
administered to the classes prior to the ad- 
ministration of the treatments. The scores 
on this subtest were sorted into approxi- 
mately equal groups of “high” and “low” 
for second-grade boys, second-grade girls, 
third-grade boys, and third-grade girls. 
Thus, the seven scores on the treatment
tasks were analyzed according to four in-
dependent variables and their combinations:
(a) treatment, (b) grade level, (c) sex, and
(d) reading achievement level nested within
grade by sex. The pooled means for these
four classifications and their combinations
were examined for trends. Differences among
comparable means were tested by analysis
of variance at the .01 level of significance,
and, when appropriate, by Scheffé confi-
dence intervals at the .05 level of signifi-
cance.


RESULTS 


The means of the variables which 
are normally used to score the tests, 
TABLE 1
ABSOLUTE DIFFERENCE AMONG MEANS OF
BOYS AND GIRLS BY TREATMENTS FOR
THE NUMBER OF ITEMS ATTEMPTED
ON SUBTEST II


Sex, treatment (BA) | (BO) | (GS) 


Note.—N = 12 classes for all groups. 
* p < .05.


i.e., the number of items correct and
the adjusted score for guessing, did 
not occur in any consistent pattern, 
and the differences were not statisti- 
cally significant. The mean scores for 
the variables involving the number of 
items attempted and the ratio of 
items correct to items attempted did 
occur in the orders predicted. The 
speed treatment group did attempt 
more items than the accuracy or con- 
trol groups, and the accuracy treat- 
ment group did achieve a higher ratio 
of items correct to items attempted 
than the speed or control groups. 


TABLE 2
POOLED MEANS AND F RATIOS OF BOYS
AND GIRLS

                                  Pooled means
Dependent variable                Boys  | Girls | F ratio
Number correct
  Subtest I                       21.74 | 24.85 | 30.
  Subtest II                      10.96 | 12.04 | 39.87*
Number correct minus one-
third the number incorrect
  Subtest I                       18.99 | 22.18 | 34.9
Number attempted
  Subtest I                       29.41 | 30.02 | 15
  Subtest II                      18.08 | 19.20 | 2
Ratio of the number correct
to the number attempted
  Subtest I                       .78   | .80   | 97.50
  Subtest II                      168   | 18.85


Note.—N = 36 classes. 
* p < .001, 1/30 df.
However, the differences among the
means were not statistically signifi- 
cant. 
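
The four kinds of score compared in this section can be sketched in a few lines. This is a minimal illustration, not the authors' scoring procedure: the one-third penalty follows the Table 2 label ("number correct minus one-third the number incorrect"), while the function name, the dictionary keys, and the treatment of "attempted" as correct plus wrong are assumptions made here.

```python
def score_subtest(n_correct, n_wrong, wrong_penalty=1 / 3):
    """Return the four kinds of score discussed in the text for one pupil.

    Assumption: an item counts as attempted if it was answered at all,
    so attempted = correct + wrong (omitted items are excluded).
    """
    n_attempted = n_correct + n_wrong
    # Adjusted score: number correct minus a fraction of the number wrong.
    adjusted = n_correct - wrong_penalty * n_wrong
    # Ratio of items correct to items attempted; undefined if nothing was attempted.
    ratio = n_correct / n_attempted if n_attempted else None
    return {
        "correct": n_correct,
        "adjusted": adjusted,
        "attempted": n_attempted,
        "ratio_correct": ratio,
    }


# Hypothetical pupil: 24 items right and 6 wrong.
print(score_subtest(24, 6))
```

A pupil who works faster raises "attempted" without necessarily raising "ratio_correct", which is why the two measures can respond differently to speed and accuracy cues.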

One interaction effect was statisti- 
cally significant. This was the Treat- 
ment X Sex interaction for the num- 
ber of items attempted on the second 
subtest. The differences among the 
pooled means associated with this 
interaction are in Table 1. An analy- 
sis of these differences using Scheffé 
confidence intervals showed that girls 
attempted significantly more items 
under speed cues than under accuracy
cues.

Girls consistently scored higher 
than boys on all seven variables, as 
shown in Table 2. The differences 
were statistically significant except 
on the two variables for the number 
of items attempted. 


CONCLUSIONS AND DISCUSSION 


The complete data (Lamb, 1965) 
are too numerous for this report, and 
must be interpreted with caution. It 
should be kept in mind that the total 
treatment period was 13 minutes and 
that the sample size was small. 

Within those limitations, it would
appear that the mean scores of two
typical measures of achievement on a
primary reading test are not signifi-
cantly affected by consistent cues to
work rapidly or carefully. This sta-
bility of result should be encouraging
to the authors, publishers, and users
of such tests.

The results indicate that girls may
be affected significantly by speed and
accuracy cues in the number of items
they attempt. The fact that this ef-
fect was apparent on the second sub-
test and not on the first might be
because the treatments had been in ef-
fect longer at the time of the second
subtest, or it might be due to the na-
ture of the tasks in the second sub-
test. Since boys were not affected in
this way, there is the implication that
girls are more responsive to a female
teacher’s verbal cues than are boys.

The clear superiority of girls over 
boys was less pronounced on the meas- 
ures of the number of items at- 
tempted. This is not easily explained. 
It may be that the dominant treat- 
ment on girls was the accuracy cues, 
and that these had the effect of reduc- 
ing the number of items they at- 
tempted. Another possibility is that 
the superior achievement of girls on 
reading tests is due more to the ac- 
curacy with which they deal with the 
items than with the number of items 
they attempt. 
ATTENTIONAL PROCESS IN READING: THE EFFECT
OF PICTURES ON THE ACQUISITION OF
READING RESPONSES


S. JAY SAMUELS
University of Minnesota


Pictures may be used as prompts when the reader cannot read a word
in the text, but pictures may miscue and may divert attention from
printed words. In Experiment I, 30 randomly assigned pre-1st graders
learned to read 4 words with no picture, a simple picture, or a com-
plex picture present. During acquisition trials, when pictures were
present, the simple and complex picture groups made more correct
responses (p < .01). During test trials, with no pictures present, the
no-picture group excelled (p < .01). In Experiment II, 26 matched
pairs of 1st graders were given classroom reading instruction under a
no-picture or picture condition. The results disclosed that poor
readers with no picture present learned more words (p < .01).
Among better readers the difference was not significant. The re-
sults are discussed in terms of attentional processes.


Psychologists have long been aware 
of the central role of attentional proc- 
esses in learning. Pavlov and his as- 
sociates found that in order to clas- 
sically condition animals distracting 
stimuli which competed for the ani- 
mal’s attention had to be eliminated. 
To accomplish this, they worked in 
a specially constructed soundproofed 
building with partitions separating 
experimenter and animal. The mere 
presence of the experimenter or the 
sound of footsteps in the experimental 
situation seriously prolonged the con-
ditioning procedure (Osgood, 1953,
p. 311). Distracting background stimuli
not only interfere with the learning
of animals but with the learning of hu-
mans as well. The ability of the indi-
vidual to withhold attention selec-
tively from irrelevant and distracting
background stimulation seems to be
implicated in reading disability, ac-
cording to Santostefano, Rutledge,
and Randall (1965). It appears that
when distracting stimuli are present
the performance of the underachiever
undergoes greater disruption than
does the performance of the more
capable student (Baker & Madell,
1965; Silverman, Davids, & Andrews,
1963).

* The author would like to express his ap-
preciation to Joseph Jenkins for his help in
data collection and data analysis for Ex-
periment I, and to Edwin Myers for his
help in data collection and data analysis in
Experiment II.

The purpose of the present study 
was to test the hypothesis that when 
pictures and words are presented to- 
gether, the pictures would function as 
distracting stimuli and interfere with 
the acquisition of reading responses. 
Pictures may be used as prompts 
when the reader cannot read a word 
in the text, but pictures may miscue 
and divert attention from the critical 
task of attending to the printed words. 

In order to test this hypothesis two 
experiments were conducted. Experi- 
ment I was designed as a laboratory 
study to test the effect of pictures on 
naive subjects (Ss) under conditions 
unlike those found in classrooms. Since 
findings derived from laboratory set- 
tings are often subject to the criti- 
cism that they could not be replicated 
in less artificial situations, Experi-
ment II was designed to test the ef-
fect of pictures on less naive Ss using
a procedure which was similar to that 
used in actual classrooms. 


EXPERIMENT I


Method 


Subjects. Thirty children who had kin- 
dergarten experience and who were en- 
rolled in a prefirst grade summer program 
were randomly assigned to one of three ex- 
perimental treatments. The Ss were pre- 
tested to assure that no one was able to read 
the words used in the experiment. 

Design. A simple, randomized design was
used. Ten Ss were randomly assigned to the 
no-picture, 10 to the simple-picture, and 10 
to the complex-picture condition. 

Materials. The pretest materials consisted 
of four 5 X 8 inch index cards with either 
“boy,” “bed,” “man,” or “car” typed on 
them. There was one word typed on each 
card. 

For the warm-up trials, four novel, non- 
sense figures that had been created for a 
study on reading were used (Jeffrey & 
Samuels, in press). The figures were highly 
discriminable from each other and suffi- 
ciently dissimilar from letters in the Roman 
alphabet to make generalization from the 
other unlikely. The figures were placed on 
5 X 8 inch index cards, one to a card. The 
Ss had to learn to associate a number with
each figure. The numbers were 1, 2, 3, and
4.


For the experiment proper, a primary 
typewriter was used to type the words, boy, 
bed, man, and car, at the bottom of 5 x 8 
inch index cards, one word to a card. The 
same words were presented in the no-picture, 
simple-picture, and complex-picture condi- 
tions. 

For learning trials in the no-picture con- 
dition, there was a word at the bottom of 
each card but no picture was present. 

For learning trials in the simple-picture
condition, the words were typed at the
bottom of each card as in the no-picture
condition. Above each word was a simple
black and white drawing portraying the ob- 
ject that the word symbolized. Previous 
work with the pictures indicated they could 
reliably elicit the same verbal response as 
was typed at the bottom of the card. 

For learning trials in the complex-picture
condition, the words were typed at the bot- 
tom as in the no-picture condition. Above 
each word was a colorful picture which had 
been cut out of a reading primer. The 
pictures were complex in that they pic-
torially represented more than the word
which was at the bottom of the card. For 
example, the picture attached to the card 
with the word “boy” at the bottom showed 
a boy holding his dog. The boy was point- 
ing to a horse in the background. 

The cards used for the test trials were 
the same for all conditions. At the bottom 
of the test cards the four words, boy, bed, 
man, and car, were typed in lower case with 
the primary typewriter, one word to a card. 
There were no pictures on any of the test 
cards. 


Procedure 


The experimenter worked individually 
with the Ss during all phases of the proce- 
dure. A pretest was given to each S. The 
S was told, “Today, we are going to play
a game. In this game we are going to learn
some words. First, let us see if you already
know what the names of the words are.”
The four words were shown to the S. If he 
was able to read any of the words, he was 
eliminated. 

Following the pretest a warm-up was 
given to each S to acquaint him with the 
nature of the learning task. The S was 
told, “Before we learn the new words, let 
us practice on some numbers. I will show 
you a card with a funny-looking number on 
it and I want you to tell me what the 
number is. If you don't know the number's 
name I will tell you what it is. You should 
try to tell me what the number is before I 
tell you. Do you understand what we are 
to do? All right? Then, what do you do 
when I show you a card with a number on 
it?” Each card was shown to S for an ap-
proximate 4-second interval. At the end
of the anticipation interval S was told the
number. The cards were presented in three
random orders. Each S was given six warm-
up trials.

When the warm-up was over, S was given
the first learning trial. He was told, “All
right, now let us see how we can learn the
new words. I will show you a card with a
word on it and I want you to tell me what
the word’s name is. If you don’t know the
word’s name I will tell you. You should try
to tell me the name before I tell you. Do
you understand?” Each card was shown to
the S for 4 seconds. Then he was told the
name. The S was scored for a correct re-
sponse if he said the appropriate word before
feedback was given.


Following the first acquisition trial, the 
first test trial was given. The test card was
presented for 4 seconds and S had to give 
the correct response during this interval. 
No feedback was given during test trials. 
Acquisition trials and test trials were alter- 
nated. Each S received 10 learning trials and 
10 test trials. The stimuli for all phases of 
the procedure were presented in three ran- 
dom orders. 


Results 


Separate analyses were computed 
for responses during acquisition trials 
and for responses during test trials. 
During acquisition trials pictures were 
present as incidental cues for the pic- 
ture conditions. During test trials 
pictures were not present. 

On the acquisition trials, as seen in 
Table 1, the mean number of correct 
responses given for the no-picture 
condition was 25.30; for the complex 
picture condition it was 36.90; and 
for the simple-picture condition it was
39.40. Comparing the simple-picture
to the no-picture condition during ac-
quisition, Ss in the simple-picture
condition gave significantly more
correct responses (t = 9.02, df = 18,
p < .01). Comparing the complex-
picture to the no-picture condition,
Ss in the complex-picture condition
gave significantly more correct re-
sponses (t = 7.42, df = 18, p < .01).

On the test trials, where incidental
cues were absent for all conditions, the
no-picture group excelled. As seen in
Table 2, the mean number of correct
responses given on the tests by Ss in
the no-picture condition was 19.20; for
Ss in the simple-picture condition it


TABLE 1
MEANS AND STANDARD DEVIATIONS FOR
TREATMENTS ON ACQUISITION TRIALS

Treatment          M       SD
No-picture         25.30   7.28
Simple-picture     39.40   1.26
Complex-picture    36.90   6.34


TABLE 2
MEANS AND STANDARD DEVIATIONS FOR
TREATMENTS ON TEST TRIALS

Treatment          M       SD
No-picture         19.20   7.93
Simple-picture     11.30   5.79
Complex-picture    11.60   4.93

was 11.30; and for Ss in the complex- 
picture condition it was 11.60. Com-
paring the simple-picture to the no-
picture condition on the test trials,
Ss in the no-picture condition recog- 
nized significantly more words (t = 
4.02, df = 18, p < .01). Comparing 
the complex-picture to the no-picture 
condition, Ss in the no-picture condi- 
tion recognized significantly more 
words (t = 3.87, df = 18, p < .01).


Discussion

The purpose of Experiment I was 
to test the hypothesis that when re- 
lated pictures and words are presented 
together, the presence of pictures 
would retard the acquisition of read- 
ing responses. This experiment used 
naive Ss under conditions unlike those 
found in classrooms. The results dis- 
closed that during the 10 acquisition 
trials, when pictures were available 
as incidental cues for appropriate 
verbal responses for Ss in the picture 
conditions, Ss in these conditions gave 
significantly more correct responses 
than did Ss in the no-picture condi- 
tion. On the 10 critical test trials, 
when pictures were not available as 
incidental cues, Ss in the no-picture 
condition gave significantly more 
correct responses. Acquisition and 
test trials were purposely alternated 
so that Ss in the picture conditions 
would be aware that the printed 
words were important stimuli. De-
spite the alternation of acquisition
and test trials, Ss in the picture con-
ditions tended to use pictures rather
than words as cues. It would appear
that pictures functioned as distract-
ing stimuli in that they drew attention
away from the printed words. A sim- 
ilar finding has been reported by Un- 
derwood (1963) and Samuels and 
Jeffrey (1966). They report that in 
paired-associate learning, under cer-
tain conditions, an S may extract
from the stimulus complex an inci-
dental cue to which the response gets
attached. When the stimulus com- 
plex is presented, the S may give the 
correct response, but for the wrong 
reasons. For example, when presented 
h-o-r-s-e, the child says “horse.” He 
may have attached the response to 
the letter “h,” however. When the 
stimulus h-o-u-s-e is presented, he 
says "horse" since he attends only to 
the one letter. 

Subsequent to the completion of 
Experiment I, two doctoral disserta- 
tions were conducted which tested the 
effect of pictures on reading acquisi- 
tion. Despite methodological differ- 
ences, these studies lend additional 
support to the finding that the pres- 
ence of pictures retards reading ac- 
quisition. Using  kindergarteners, 
Braun (1967) found significant dif- 
ferences in reading acquisition favor- 
ing the no-picture group on seven of 
eight comparisons. Harris (1967), 
who used kindergarteners from a low 
socioeconomic background, found sig- 
nificant differences in acquisition on 
four of eight comparisons, favoring 
the no-picture group. Harris attrib- 
uted his failure to find significance on 
more comparisons to the generally 
low level of learning for all his Ss re- 
gardless of experimental condition. In 
both studies, all comparisons which 
did not reach significance were in the 
predicted direction. 
EXPERIMENT II


Method 


Subjects. Fifty-two students from a Min- 
neapolis public school with 7 months first- 
grade experience were used as Ss. 

Design. A treatment by levels design was 
used. The Ss were divided into two matched 
groups. There were 26 Ss in the picture and 
26 Ss in the no-picture condition. Half the 
Ss in each condition were designated as 
above and the other half as below the 
median based on pretest scores. The same 
test was used as pretest and posttest for all 
Ss, regardless of condition. 

Materials. A pretest and posttest were 
used in this study. Both tests were exactly 
the same. The test was constructed by typing 
in large type each of the 50 different words 
used in the story “Fun at Blue Lake” on 
3 X 5 inch index cards, one word to a card. 

The reading material for the no-picture 
and picture condition was the same. A story 
called “Fun at Blue Lake” was written, It 
was 106 words long and contained 50 dif- 
ferent words. The words were typed in large 
type and the story was mimeographed. A 
book was made for each S. The story, “Fun 
at Blue Lake," was pasted on the right face 
of the book for both conditions. The Ss
in the picture condition had a picture from 
a reading primer pasted on the left face of 
the book. The picture showed a cabin in 
the woods with a lake in the foreground. At 
the lakeshore was a family and their dog. 
In the no-picture condition the left face of 
the book was blank. When the books were 
opened Ss in the picture condition saw the 
text on the right and a picture on the left. 
The Ss in the no-picture condition saw a
blank page on the left and the text on the
right.


Procedure 


Several days before the experiment 
proper was run, Ss were pretested on the 
50 words used in the story. The pretest con-
sisted of showing each of the 50 words used
in the story to S. The words were exposed
one at a time to S, allowing 10 seconds for
a response. The Ss were then matched on
pretest scores and randomly assigned to
either the picture or no-picture condition.
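
The matching and assignment step can be sketched as follows. The article does not say how the matched pairs were formed, so this sketch assumes one common approach: rank the Ss by pretest score, pair adjacent ranks, and let chance decide which member of each pair goes to which condition. The function name, the subject labels, and the seed parameter are hypothetical.

```python
import random


def assign_matched_pairs(pretest_scores, seed=None):
    """Pair subjects with adjacent pretest ranks and split each pair at random.

    pretest_scores maps a subject label to a pretest score (hypothetical data).
    Returns a dict mapping each subject label to "picture" or "no-picture".
    With an odd number of subjects, the lowest-ranked one is left unassigned.
    """
    rng = random.Random(seed)
    # Rank subjects from highest to lowest pretest score.
    ranked = sorted(pretest_scores, key=pretest_scores.get, reverse=True)
    assignment = {}
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # chance decides which member gets which condition
        assignment[pair[0]] = "picture"
        assignment[pair[1]] = "no-picture"
    return assignment


# Hypothetical pretest scores for 52 subjects.
scores = {f"S{i:02d}": i for i in range(52)}
groups = assign_matched_pairs(scores, seed=1)
print(sum(g == "picture" for g in groups.values()))  # 26
```

Pairing before randomizing keeps the two conditions close in pretest ability, which is what allows a later comparison within each matched pair.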

Reading instruction was given to both
conditions simultaneously. The groups were
separated in the room so that Ss in the no-
picture condition were unable to see the
pictures of the Ss in the picture condition.
Reading instruction was given to small
groups. At no time were there more than
eight Ss in the room. The instructional 
procedure paralleled that used in typical 
classrooms. Instruction consisted of moti- 
vating and building background for the 
story, reading for a purpose, silent reading, 
and then oral reading. The Ss were in- 
structed to raise their hands during silent 
reading if they were unable to read any 
word. The experimenter went about whisper-
ing the words to the children who requested
help. The experimenter was careful to give 
help to students in both groups and both 
groups were given opportunities to read 
aloud. 

Immediately following reading instruc- 
tion, the Ss were given the posttest. They 
were tested by four assistants on the 50 
words used in the story in a similar manner 
to that which was used in the pretest. 


Results 


Two comparisons were made to test 
differences in word recognition on the 
posttest. In the first comparison, as 
seen in Table 3, above-median Ss in 
the picture condition recognized a 
mean of 43.15 words while above- 
median Ss in the no-picture condition 
recognized a mean of 42.08 words.
This difference was not significant. In 
the second comparison, below-median 
Ss in the picture condition recognized 
a mean of 23.69 words, while below-
median Ss in the no-picture condition
recognized a mean of 26.23 words.
This difference was significant (t = 
2.73, df = 12, p < .01, one-tailed 
test). 


Discussion 


Experiment II was designed to test 
the effect of pictures on reading ac- 
quisition using a procedure which was 
similar to that used in classrooms. The 
Ss in this experiment had 7 months 
of formal reading instruction. No sig- 
nificant difference was found in read- 
ing acquisition between the picture 
and no-picture condition among the 
better readers. Among the poorer
readers, Ss in the no-picture condition
TABLE 3
WORD RECOGNITION SCORES ON POSTTEST
FOR ABOVE AND BELOW MEDIAN SUBJECTS
FOR TREATMENTS

Reading ability            M       SD
Below median
  No-picture treatment     26.23   8.48
  Picture treatment        23.69   7.69
Above median
  No-picture treatment     42.08   4.57
  Picture treatment        43.15   6.05

learned to read significantly more 
words than did Ss in the picture con- 
dition. The results of Experiment II 
support the findings of Silverman, 
Davids, and Andrews (1963) and 
Baker and Madell (1965) in that per- 
formance of the less capable student 
was affected more by distracting 
stimuli than was the performance of 
the more capable student. 

Several questions are raised by the 
two experiments. Considering the ef- 
fect which pictures had on reading ac- 
quisition of naive and less capable 
students, one may wonder if it is good 
practice to put pictures in reading 
primers. Another question left un- 
answered is how the more capable 
student uses pictures when they are 
available. It would seem advisable to 
continue the investigation of the role 
of pictures in readers in terms of mo- 
tivation, student attitude towards 
reading, and attentional processes. 
PREDICTION OF STUDENT ACCOMPLISHMENT
IN COLLEGE


In samples with a broad range of talent, the academic and non-
academic achievements of college students were predicted. Criteria
included college grades, 12 scales designed to measure notable extra- 
classroom accomplishment in college, and 1 scale to assess recognition 
for academic accomplishment. Predictors included scores on ACT 
tests, high school grades, and 6 scales measuring nonacademic accom- 
plishment in high school. Results indicate that nonacademic accom- 
plishment can be assessed with moderate reliability, that both aca- 
demic and nonacademic accomplishment can be predicted to a useful 
degree, and that nonacademic accomplishment is largely independent
of academic potential and achievement. 


The present study aims to predict 
student achievement in college from 
a comprehensive assessment of stu- 
dent achievement and potential in 
high school. Previous studies de-
signed to predict academic and ex-
tracurricular achievement in college
for students of superior scholastic 
aptitude (Holland, 1959, 1960, 1961; 
Holland & Astin, 1962; Holland & 
Nichols, 1964; Nichols & Holland, 
1963) are extended by this study, 
which is similar to them in its goals 
and longitudinal method. It differs 
from them, however, in that predic- 
tions are made for students with a 
broad range of academic potential. 

The present study is also related 
to many other investigations of sim-
ilar problems. Among these problems
are the relationship between academic
potential and originality, the descrip-
tion of creative persons, the develop-
ment of criteria of creative perform-
ance, and the prediction of adult
accomplishment. Researchers who 
have worked on such problems in- 
clude: Astin (1962); Barron (1963); 
Buel (1965); Chambers (1964); Ci-
cirelli (1965); Flescher (1963); Get-
zels and Jackson (1962); Gough, Hall,
and Harris (1963); Guilford (1964);
Hoyt (1966); Locke (1963); MacKin-
non (1960); Mann (1958); Price,
Taylor, Richards, and Jacobsen 
(1964); Skager, Schultz, and Klein 
(1965); Sprecher (1959); Taylor, 
Smith, and Ghiselin (1963); Thorn- 
dike and Hagen (1959); Torrance 
(1963); and Wallach and Kogan 
(1965). 

The rationale for this study is that 
typical measures used in the selection 
of college students—tests of academic 
potential and high school grades—con- 
centrate on only one dimension of 
talent and ignore other important di- 
mensions (Holland & Richards, 
1965). Accordingly, if we want to 
find college students who will do out- 
standing things outside the classroom 
and in later life, we need a record of 
student achievements outside the 
classroom in high school. The present 
study examines the predictive valid- 
ity of one such record of stu- 
dent achievement. 


METHOD


Predictors 

The predictive variables included the
following measures:

1. ACT tests. The test battery, a college
admissions test administered nationally, 
yields the following subtest scores: English, 
mathematics, social studies, and natural sci- 
ence. Each score is converted to a common 
scale with a mean of approximately 20 and
a standard deviation of about 5 for college- 
bound high school seniors. The reliabilities 
of the ACT tests (American College Testing 
Program, 1965), the high correlations be- 
tween the ACT battery and other similar 
measures (Eells, 1962), and the similar re-
lationship of the ACT battery and of simi-
lar measures to college grades (Munday,
1965) all indicate that the ACT battery is a 
typical measure of academic potential. 
Therefore, we would not expect markedly 
different results in the present study if we 
had used some other academic test or test 
battery. 

2. High school grades. As a regular part 
of the ACT procedure, persons taking the 
ACT battery report the grades they have 
received in high school courses in four 
areas: English, mathematics, social studies, 
and natural science. Research by Davidsen 
(1963) indicates that in a large sample such 
self-reported grades correspond closely to 
the high school transcripts. A reanalysis of 
Davidsen’s data by the present authors 
yielded a correlation of .92 between stu-
dent-reported and school-reported grades.
The measure used in the present study is
the overall average on a 4-point scale (A =
4, B = 3, etc.) of all grades reported. In an-
other study by Hoyt (1964) the predictive
efficiency of average self-reported grades 
equaled the predictive efficiency of the 
student’s rank in the high school class ob- 
tained from his transcript. 
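
As a small illustration of the grade measure, the overall average on a 4-point scale might be computed as below. The article gives only the top of the scale and "etc."; the full mapping used here (A = 4, B = 3, C = 2, D = 1, F = 0) is the conventional one and is an assumption, as are the function name and the sample grades.

```python
# Assumed conventional 4-point mapping; the article spells out only the top of the scale.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def high_school_average(reported_grades):
    """Average the self-reported letter grades across the reported subject areas."""
    points = [GRADE_POINTS[grade] for grade in reported_grades.values()]
    return sum(points) / len(points)


# Hypothetical student reporting one grade in each of the four areas.
student = {
    "English": "A",
    "mathematics": "B",
    "social studies": "B",
    "natural science": "C",
}
print(high_school_average(student))  # 3.0
```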

3. Extracurricular achievement record. 
Checklists of extracurricular accomplish-
ment for the high school years were used
to obtain scores in the following areas: art, 
music, literature, dramatic arts, leadership, 
and science (Holland & Nichols, 1964). 
Items ranged from common and less im- 
portant accomplishments to rare and more 
important ones. For example, science items 
included accomplishments such as: did an 
independent scientific experiment; won a 
prize or award of any kind for scientific work 
or study; had scientific paper published in
a scientific journal. The remaining scales 
consisted of similar items planned to assess 
a broad range of achievement. The score 
on each scale is simply the number of ac- 
complishments the student has attained. 

The achievement record was obtained as 
part of the American College Survey. The 
Survey booklet contains several sections de- 
signed to elicit information about a stu- 
dent’s aspirations, achievements, attitudes, 
interests, potentials, values, and background 
(Abe, Holland, Lutz, & Richards, 1965). In 
the American College Survey sample, the
reliabilities (K-R 20) of the achievement
scales ranged from .72 to .84 for men and
from .65 to .81 for women.
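
For readers unfamiliar with the reliability index, Kuder-Richardson formula 20 for a checklist of k yes/no items is k/(k - 1) times (1 minus the sum of p(1 - p) over items, divided by the variance of total scores), where p is the proportion of students checking an item. The sketch below is a generic implementation, not the authors' computation; it assumes the population form of the variance (dividing by n), and the tiny response matrix is hypothetical.

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items.

    item_matrix: one row per student, one 0/1 entry per checklist item.
    Assumption: total-score variance uses the population form (divide by n).
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")
    # Sum of item variances p * (1 - p), where p is the proportion endorsing the item.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses of four students to a three-item checklist.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(responses))  # 0.75
```

Because each scale score is simply a count of checked accomplishments, an index for dichotomous items of this kind is the natural choice.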


Student Sample 


The student sample was obtained from a 
follow-up of students who participated in 
the American College Survey (Abe et al.,
1965). In the original study, a comprehen- 
sive assessment was administered to 12,432 
college freshmen in 31 institutions of higher 
education during the months of April or 
May of 1964. The sample for the present 
study is restricted to the 7208 students at 
22 of the 29 colleges participating in the fol- 
low-up study who also took the American 
College Testing battery in the academic 
year 1962-63 as part of their application for 
admission to college. The record of college 
accomplishments for these students was ob- 
tained in the spring of 1965 at the end of 
their sophomore year in college. 

In September of 1964, a second study in- 
volving the American College Survey was 
conducted in which the same comprehensive 
survey was administered to 5668 entering 
freshmen at six colleges.¹ This second sam-
ple of 2483 is also restricted to the freshmen
in the larger group who took the American 
College Testing battery as part of their 
application for admission to college. The 
follow-up data for these students also were 
collected in the spring of 1965 at the end 
of their freshman year in college. 

Each college was responsible for the ad- 
ministration of the follow-up questionnaire. 
Several techniques were used to contact 
students: Some colleges had students fill out 
the questionnaire in English classes, convo- 
cations, or other group sessions; other col- 
leges polled their students by mail. Com- 
plete follow-up data were obtained for 2792 


¹ Tables showing the colleges for the two
samples and the details of certain statistical
analyses discussed later (t tests comparing
students with and without follow-up data,
complete intercorrelation matrices, and
correlation between academic predictors and
all criteria at individual colleges) have been
deposited with the American Documenta-
tion Institute. Order Document No. 9 f
from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Con-
gress, Washington, D.C. 20450. Remit in
advance $1.25 for photocopies or $1.25 for
microfilm and make checks payable to:
Chief, Photoduplication Service, Library of
Congress.
sophomore students (1373 men and 1419 
women) and 1095 freshman students (503 
men and 592 women). Follow-up data were 
thus obtained for 39% of the sophomores 
and 44% of the freshmen. Students with 
missing follow-up data include both students 
who left college and students still enrolled 
in college who failed to complete the fol- 
low-up questionnaire. 
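
The return rates quoted above can be checked directly from the counts given in the text. The following is a minimal sketch; the dictionary and function names are illustrative and are not part of the original study.

```python
# Counts reported in the text: students with complete follow-up data,
# and students eligible for follow-up in each sample.
followed_up = {"sophomores": 1373 + 1419, "freshmen": 503 + 592}
eligible = {"sophomores": 7208, "freshmen": 2483}


def return_rate(group: str) -> float:
    """Percentage of eligible students with complete follow-up data."""
    return 100.0 * followed_up[group] / eligible[group]


# Rounded to whole percentages, as the text reports them.
rates = {group: round(return_rate(group)) for group in followed_up}
print(rates)
```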

Because this is a low return rate, it is 
important to know what biases there may 
be in the sample with follow-up data. Accordingly, 
t tests were computed between 
students with and without follow-up data 
on each of the predictor variables in each 
of the groups. While each of these t tests is 
not completely independent of every other 
test (some of the variables are correlated to 
a substantial degree), for the purposes of 
this study, any error introduced is conserva- 
tive since it is more likely that a number of 
significant differences will be found between 
students with and without follow-up data. 

The primary trend in the results is for 
students with missing follow-up data to 
have significantly lower ACT scores and 
high school grades. This is to be expected, of 
course, since this group includes students 
who left college because of academic fail- 
ure. However, because the Ns in this study 
are very large, a small absolute difference 
can be highly significant. The actual differ- 
ences on ACT scores and high school grades 
between students with and without follow-up 
data are not large relative to the standard 
deviations of these variables. On the extracurricular 
achievement scales, only a few 
differences are significant, and these fall 
into no consistent pattern. It appears, therefore, 
that although there are some significant 
differences between students with and with- 
out follow-up data, it is unlikely that the 
results of this study are seriously distorted 
by these differences because virtually a full 
range of accomplishment is present in the 
groups with follow-up data. 
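
The point that a small difference can be highly significant when the Ns are large may be illustrated with a short sketch. The scores below are hypothetical, not data from the study, and the pooled-variance t statistic and the standardized difference (the mean difference divided by the pooled standard deviation) are assumed here as reasonable stand-ins for the comparisons described; the original computations may have differed in detail.

```python
import math
from statistics import mean, stdev


def t_and_d(a, b):
    """Pooled-variance t statistic and standardized mean difference."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    pooled_sd = math.sqrt(pooled_var)
    diff = mean(a) - mean(b)
    t = diff / (pooled_sd * math.sqrt(1 / na + 1 / nb))
    d = diff / pooled_sd  # difference in standard-deviation units
    return t, d


# Hypothetical scores: every score in one group is half a point higher.
pattern = [14, 17, 20, 23, 26]


def compare(copies):
    """Compare two groups built from `copies` repetitions of the pattern."""
    with_followup = [x + 0.5 for x in pattern] * copies
    without_followup = pattern * copies
    return t_and_d(with_followup, without_followup)


t_small, d_small = compare(4)    # 20 students per group
t_large, d_large = compare(400)  # 2000 students per group
print(f"N = 20 per group:   t = {t_small:.2f}, d = {d_small:.2f}")
print(f"N = 2000 per group: t = {t_large:.2f}, d = {d_large:.2f}")
```

In this sketch the standardized difference is nearly the same in both comparisons, while the t statistic grows with the square root of the sample size.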

To summarize, because the colleges used 
such diverse means of administering the 
survey and because there are significant 
differences between students with and without 
follow-up data, our samples may not be a 
precise representation of the college 
populations included. Nevertheless, our samples 
do represent a broad range of students from 
diverse institutions. Because most earlier 
studies of this problem were based on a 
narrow range of talent, the present samples 
permit a more definitive examination of the 
relationships in question. 
Criteria of Achievement 


The criterion variables included the fol- 
lowing measures: 

1. College grades. Each student reported 
his grade average for his last college term 
by checking one of the following alterna- 
tives: D or lower, D+, C, C+, B, B+, A 
or A+. Scores from 1 to 7 were assigned to 
these alternatives so that a high score indi- 
cates high grades. 

2. Nonclassroom achievement record. A 
checklist of nonacademic accomplishments 
was developed to measure achievement in 
the following areas: leadership, social par- 
ticipation, art, social service, science, busi- 
ness, humanities, religious service, music, 
writing, social science, and speech and drama. 
A simple scale was also developed to deter- 
mine public recognition for academic attainment 
in college. Each scale is, in a 
sense, a criterion or standard of accomplish- 
ment in an important area of human en- 
deavor. Students with high scores on one 
or more scales are assumed to have attained 
a high level of accomplishment which re- 
quired complex skills, long term persist- 
ence, or originality, and which generally 
received public recognition. A detailed ac- 
count of the rationale, development, and 
statistical characteristics of these scales is 
presented elsewhere (Richards, Holland, & 
Lutz, in press). 

Each scale includes ten items, except the 
Recognition for Academic Accomplishment 
Scale, which has five items. In responding 
to the items, the student marks “yes” for 
those accomplishments which he has 
achieved during college and “no” for those 
which he has not achieved. The score on 
each scale is simply the number of “yes” 
responses. 
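
The scoring rules just described are simple enough to state as a short sketch. The names below are illustrative only; the seven grade alternatives and the count of "yes" responses follow the description in the text.

```python
# Scores 1 to 7 assigned to the grade alternatives, high score = high grades.
GRADE_SCORES = {
    "D or lower": 1,
    "D+": 2,
    "C": 3,
    "C+": 4,
    "B": 5,
    "B+": 6,
    "A or A+": 7,
}


def scale_score(responses):
    """Score on one checklist scale: the number of 'yes' responses."""
    return sum(1 for response in responses if response == "yes")


# Hypothetical student: a ten-item scale with two accomplishments checked.
example_responses = ["yes", "no", "no", "yes", "no", "no", "no", "no", "no", "no"]
print(GRADE_SCORES["B+"], scale_score(example_responses))
```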

Items range from common and less im- 
portant accomplishments to rare and more 
important ones. For example, leadership 
accomplishments included: elected to one 
or more student offices, active member of 
four or more student groups, served on a 
student-faculty committee. Music accom- 
plishments included: composed or arranged 
music which was publicly performed, pub- 
licly performed on two or more musical in- 
struments (including voice) which do not 
belong to the same family of instruments, 
attained a first-division rating in a state or 
regional solo music contest. The remaining 
scales consisted of similar items with con- 
tent appropriate to the various areas of 
achievement. In general, the accomplish- 
ments involve public action or recognition, 
TABLE 1 

MEANS AND STANDARD DEVIATIONS ON COLLEGE ACHIEVEMENT 
SCALES FOR THE STUDENT SAMPLES 

                                                     Men                            Women 
                                          Freshmen      Sophomores       Freshmen      Sophomores 
Scale                                     M      SD      M      SD       M      SD      M      SD 
Scientific achievement                   .18     .68    .32     .86     .07     .32    .10     .30 
Leadership achievement                   .63    1.30    .88    1.53     .72    1.33   1.40    1.83 
Speech and dramatic achievement          .31     .81    .30     .90     .26     .75    .36     .93 
Artistic achievement                     .38     .87    .48    1.07     .67    1.20    .85    1.34 
Writing achievement                   [illeg.]   .73    .27     .73     .39     .75    .50    1.01 
Musical achievement                   [illeg.]   .61    .21     .73     .14     .62    .27     .78 
Social participation                     .80    1.35    .90    1.39     .77    1.30   1.03    1.34 
Social service achievement               .55    1.07    .70    1.21     .83    1.24   1.20    1.40 
Business achievement                     .54     .91    .608   1.00     .22     .51    .84     .65 
Humanistic-cultural achievement          .94    1.21   1.04    1.33    1.23    1.42   1.45    1.48 
Religious service                        .73    1.48   1.34    2.20    1.30    2.06   1.98    2.41 
Social science achievement               .24     .57    .33     .70     .27     .54    .82     .60 
Recognition for academic accomplishment  .14     .46    .36     .69     .14     .40    .44     .81 

Note.—For freshmen men, N = 503; for sophomore men, N = 1373; for freshmen women, 
N = 592; for sophomore women, N = 1419. 


so that, in principle, they could be verified. 
It was assumed that such possibility of ver- 
ification would lessen student exaggeration 
and allow a comparison of student self-reports 
with public records. 

The nonclassroom college achievement 
scales were administered as a part of a com- 
prehensive follow-up of the American Col- 
lege Survey (Abe et al., 1965). The follow- 
up questionnaire elicited information about 
a college student's achievements, aspirations, 
self-concept, satisfactions, and attitudes. 


RESULTS 


The means and standard deviations of the college-achievement scales 
for the various samples are summarized in Table 1. The distributions 
of the nonacademic accomplishments are highly skewed and almost 
dichotomous, and the standard deviations are larger than the means.² 
This skewness occurs because each scale contains accomplishments that 
are rare among college students. (The modal number of accomplishments 
on most scales is zero.) Differences among the areas of accomplishment 
probably reflect differences both in the level of accomplishment 
represented by the various items and in the opportunity for various 
kinds of achievement in college. 

As a next step, correlations were computed among all of the variables, 
both predictor and criterion.³ In general, there are: (a) moderate 
correlations among measures of academic potential and performance, 
(b) moderate correlations among nonclassroom achievements in the same 
or closely related areas, (c) low to moderate correlations among 
nonclassroom achievements in areas which are not closely related, and 
(d) low relationships between nonclassroom achievements and measures 
of academic potential and performance. The relationships are 
consistent with what previous investigators have found (Holland, 1959, 
1960, 1961; Holland & Astin, 1962; Holland & Nichols, 1964; Holland & 
Richards, 1965; Nichols & Holland, 1963). 

² The skewness of such distributions has had little effect in previous 
studies, however, on Pearson correlations involving similar variables 
(Holland & Richards, 1965). It is possible that the results of this 
study are distorted by the use of skewed, almost dichotomous variables 
in the multiple regression analyses, although the consistency and 
meaningfulness of the findings suggest that such distortion is unlikely. 

³ Computations for this study were carried out at Measurement Research 
Center, University of Iowa, and at the University of Utah Computer 
Center. 

The most important of these find- 
ings is the low relationship between 
nonclassroom achievements and meas- 
ures of academic potential and per- 
formance. These correlations are 
based, of course, on combining stu- 
dents at the various colleges into a 
single group. Although it is unlikely, 
this low relationship might be an 
artifact of combining students in dif- 
ferent colleges. To check this pos- 
sibility, the correlations between aca- 
demic predictors and all criteria for 
male sophomores at individual col- 
leges were computed. This analysis 
was restricted to the 14 colleges hav- 
ing 25 or more students with complete 
data. 

The results indicate that there is indeed considerable variation among 
colleges in the relationship between individual predictors and 
individual criteria. However, the median correlations in every case 
are very close to the corresponding correlations which were calculated 
using all students combined. Moreover, the differences among colleges 
apparently are more random than consistent and meaningful. These 
results indicate, therefore, that combining students from different 
colleges has not distorted the relationships. Correlations at 
individual colleges for the other samples and other variables in this 
study supported this interpretation. 
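
The check described above, comparing the correlation for all students combined with the median of the correlations computed within individual colleges, can be sketched as follows. The function names and the toy data are hypothetical; only the rule of restricting the analysis to colleges with 25 or more students comes from the text.

```python
import math
from statistics import mean, median


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_and_median_r(by_college, min_n=25):
    """Return (pooled r over all students, median of within-college r).

    by_college maps a college label to a (predictor, criterion) pair of
    lists. Colleges with fewer than min_n students are left out of the
    within-college analysis but stay in the pooled correlation.
    """
    all_x = [x for xs, _ in by_college.values() for x in xs]
    all_y = [y for _, ys in by_college.values() for y in ys]
    within = [pearson(xs, ys) for xs, ys in by_college.values() if len(xs) >= min_n]
    return pearson(all_x, all_y), median(within)


# Toy illustration with three tiny hypothetical colleges (min_n lowered to 3).
toy = {
    "A": ([1, 2, 3], [1, 2, 3]),
    "B": ([1, 2, 3], [3, 2, 1]),
    "C": ([1, 2, 3], [1, 3, 2]),
}
pooled_r, median_r = pooled_and_median_r(toy, min_n=3)
print(round(pooled_r, 3), round(median_r, 3))
```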

As our next step, we computed mul- 
tiple correlations by selecting the 
most efficient predictors of each cri- 
terion from the 11 predictor variables. 
A step-wise multiple-regression com- 
puter program was used which, at 
each step, adds the variable which 
most improves prediction. This com- 
puter program computes an F test 
after each step to test the significance 
of the reduction of residual variance 
caused by the addition of the variable 
in that step. For the final multiple- 
regression equation, the computer re- 
tains only those variables producing 
a significant reduction in residual 
variance. 

However, many variables which 
produced a statistically significant 
reduction in residual variance had no 
practical effect on the size of the mul- 
tiple correlation. Accordingly, rather 
than using a statistical test, only 
those variables were retained which 
increased the multiple correlation by 
at least .01. In every case, the number 
retained using this criterion is smaller 
than the number retained using a 
statistical test of significance as the 
criterion. 
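
The retention rule described above can be sketched as a simple forward selection that stops when the best remaining variable raises the multiple correlation by less than .01. This is an illustration only, not the computer program used in the study: it omits the F test, all function and variable names are hypothetical, and the multiple correlation is obtained from the predictor intercorrelations by solving for the beta weights.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def solve(a, b):
    """Solve the linear system a w = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def multiple_r(predictors, y, names):
    """Multiple correlation of y with the named predictors."""
    rxx = [[pearson(predictors[a], predictors[b]) for b in names] for a in names]
    rxy = [pearson(predictors[a], y) for a in names]
    betas = solve(rxx, rxy)  # standardized (beta) weights
    r_squared = sum(beta * r for beta, r in zip(betas, rxy))
    return math.sqrt(max(r_squared, 0.0))


def forward_select(predictors, y, min_gain=0.01):
    """Add the best predictor at each step until R rises by less than min_gain."""
    chosen, current_r = [], 0.0
    while len(chosen) < len(predictors):
        candidates = [name for name in predictors if name not in chosen]
        best = max(candidates, key=lambda name: multiple_r(predictors, y, chosen + [name]))
        best_r = multiple_r(predictors, y, chosen + [best])
        if best_r - current_r < min_gain:
            break
        chosen.append(best)
        current_r = best_r
    return chosen, current_r


# Hypothetical data: the criterion depends almost entirely on one predictor.
example_predictors = {
    "hs_leadership": [1, 2, 3, 4, 5, 6, 7, 8],
    "act_english": [1, -1, -1, 1, 1, -1, -1, 1],
}
example_criterion = [a + 0.05 * b for a, b in zip(*example_predictors.values())]
print(forward_select(example_predictors, example_criterion))
```

In the hypothetical example the second variable would raise the multiple correlation only in the fourth decimal place, so it is dropped under the .01 rule even though it contributes to the criterion.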

Eight of the criterion variables— 
college grades, leadership, art, science, 
music, writing, speech and drama, 
and recognition for academic accom- 
plishment—were designed specifically 
to assess at the college level the same 
characteristics the predictors measure 
at the high school level. The beta 
weights and multiple correlations for 
these criteria for freshmen are sum- 
marized in Table 2 and those for 
sophomores are summarized in Table 
3. 
The most notable finding in Tables 2 and 3 is the great importance of 
specific content in predicting achievement. For the nonacademic 
accomplishment scales, the best predictor of 
accomplishment in college is similar 
accomplishment in high school, and 
in the majority of cases similar high 
school accomplishment is the only 
variable contributing to the predic- 
tion of college accomplishment. More- 
over, in every remaining case, the 
prediction of nonacademic accom- 
plishment is improved only slightly 
by adding variables to the corresponding high school achievement 
scales, an improvement likely to disappear on cross-validation. These 
findings are consistent, of course, with 
a substantial literature which reveals 
that past performance predicts future 
performance. 
For the two measures of academic accomplishment, the most consistently 
high predictor is high school grades, and, in general, some weighted 
combination of high school grades and ACT test scores is a better 
predictor than high school grades alone. This finding, too, is 
consistent with a large number of previous investigations of the 
prediction of academic performance. 

The information in Tables 2 and 3 
also confirms earlier findings that 
academic potential and success have 
little relationship to effective non- 
academic performance (Astin, 1962; 
Getzels & Jackson, 1962; Gough et 
al., 1963; Holland & Nichols, 1964; 


TABLE 2 
MULTIPLE CORRELATIONS FOR COLLEGE FRESHMEN FOR CRITERIA OF ACHIEVEMENT HIGHLY 
COMPARABLE TO THE HIGH SCHOOL ACHIEVEMENT SCALES 


Men Women 
Criterion 
Predictors Beta | r Predictors Beta | r 
College grades ACT social studies +2406 | .29 | High school +2874 | .37 
High school rs “Tate | :82 | AGT soctel studios 2580 | 44 
Art achievement (HS) | 10838 | .33 
Leadership achievement (Col.) +2831 | .29 | Leadership achieve- .1994 | .25 
Ky NM ment (HS) 
Drama achievement +1538 | .32 | Drama achievement 1798 | .30 
(H8) (HS) 
Artistic achievement (Col) | Art achie t (AS -3804 | 41 hi t (HS) me | 
K E vemen! Gs) ko Art achievement (H8) 
(H8) 
Scientific achievement (Col. | Science achievement — .81 | Science achievement - E 
T (H8) (H8) 
Musical achievement, (Col. Music achie t (HS)| .4157 | . ic achi t (HS)| .3310 | .35 
MEM C E IRI EIE 
Writing achievement (Col.) Literary achievement | — iterary achievement | 8999 M 
ib. T 1439 | -46 
(HS) 
Speech & drama achievement Dam achievement .8391 | .34 | Drama achievement i Eus 
ACT natural science — |—.0861 | .35 
ition f. i 1079 | -19 
peer Beene vs heel n 3 E n: 1211 | .25 
(HS) . ‘ment (HS) elitr 
Leadership achie 1028 | .29 | ACT mathematics +1878 | +59 
ment (HS) Drama achievement +0918 | - 
cABU Sce .0805 | -30 
(HS) 


Note.—In this and the following three tables, only variables increasing the multiple correlation by at least .01 are 
retained. The correlation beside each is the multiple correlation of that variable plus those listed above it. 
Abbreviations in parentheses are as follows: Col. = College, HS = High School. For men, N = 503; for women, N = 592. 
TABLE 3 


MULTIPLE CORRELATIONS FOR COLLEGE SOPHOMORES FOR CRITERIA OF ACHIEVEMENT 
HIGHLY COMPARABLE TO THE HIGH SCHOOL ACHIEVEMENT SCALES 


Men 
Criterion 
Predictors 
College grades High school grades 
ACT social studies 
Leadership achievement (Col.) achieve- 
ment (H8) 
rama 
RBA n 
t (HS) 
Artistic achievement (Col.) Art achievement (HS) 


Scientific achievement (Col.) 
Musical achievement (Col.) 


Writing achievement (Col.) 


Speech & drama achievement | Drama achievement 
(Col.) (HS) 
iterary achievement 
(H8) 
Recognition for academic High school grades 
accomplishment (Col.) Al ish 
Literary achievement 
38; 
ACT mathematics 


Women 
Beta | r Predictors Beta | r 
3168 39 | High school grades 
“gon | 244 | ACT English des | d 
ACT social studies 11585 | 50 
1857 28 | Leadership achieve- 2984 | .35 
ment (HS) 
+1253 31 | ACT social studies .1288 | .88 
1036 33 | Literary achievement .1092 | .3f 
(H8) i 
-— .44 | Art achievement (HS) = „51 
- -40 | Science achievement — .24 
(H8) 
- .49 | Music achievement = 89 
(H8) 
- .45 | Literary achievement - 
eres 'vemen! EG 
.2802 | .33 | Drama achievement E 
GS 900 | .39 
.0928 34 | ACT mathematics —.0900 | .40 
2113 34 | High school grades .2019 | . 
1276 | .39 | ACT English 1630 3 
466 | .42 | ACT natural science .1541 | .45 
.MT0 | .43 


Note.—Abbreviations in parentheses are as follows: Col. = College, HS = High School. For men, N = 1373; for 
women, N = 1419. 


Hoyt, 1966; MacKinnon, 1960; Price 
et al., 1964; Thorndike & Hagen, 
1959; Torrance, 1963). In these tables, 
academic predictors relate to academic criteria and nonclassroom 
predictors relate to nonclassroom criteria. Thus there is both 
convergent and discriminant validity. This is especially important in 
the case of the Recognition for Academic Accomplishment Scale. This 
scale is a self-report of achievements comparable to the nonclassroom 
achievement scales. Furthermore, the items for this scale were mixed 
with items from the nonclassroom achievement scales in the same 
section of the follow-up questionnaire. Unlike the nonclassroom 
achievement scales, however, this scale should be correlated with 
academic predictors. Because this 
scale was correlated with academic 
predictors and the nonclassroom 
achievement scales were not, the re- 
sults make it less plausible that re- 
sponse bias, dissimulation, or simi- 
lar occurrences invalidate student 
responses to these scales. In other 
words, the results imply that the 
average student gave a frank account 
of his accomplishments in high school 
and in college. 

The remaining six criterion scales make our assessment of student 
accomplishment more comprehensive, but they were not planned to 
measure achievement in the same areas measured by the high school 
achievement scales. It was expected, then, that the multiple 
correlations between the criteria and the predictors would be lower 
than the correlations for the criteria that are highly comparable to 
the high school achievement scales. 
The multiple correlations for these 
criteria are summarized for freshmen 
in Table 4 and for sophomores in 
Table 5. 

The multiple correlations in Tables 
4 and 5 are much lower than the mul- 
tiple correlations in Tables 2 and 3. 
In Tables 4 and 5, there is some 
tendency for those scales that are 
most similar to the high school 
achievement scales to be most predictable, 
and for the most similar 
high school scale to be the best pre- 
dictor of the score on the similar col- 
lege achievement scale. For example, 
high school Leadership Achievement 
is the best predictor of college Social 
Participation, and high school Liter- 
ary Achievement is the best pre- 
dictor of college Humanistic-Cultural 
Achievement. For the most part, the 
correlations in Tables 4 and 5 support 
the conclusion that academic predic- 
tors contribute little to the prediction 
of nonclassroom accomplishment. 
Again, probably the most striking 
thing suggested by Tables 4 and 5 
is the importance of specific content. 
For the college criteria having no 
corresponding high school predictors, 
the variables selected for predicting 
the various criteria, and their beta 
weights, are not highly comparable 
for freshmen and sophomores. One 
would expect, therefore, the already 
low multiple correlations to drop on 
cross-validation. Consequently, a bet- 
ter approach to predicting these var- 


TABLE 4 
MULTIPLE CORRELATIONS FOR COLLEGE FRESHMEN FOR CRITERIA OF ACHIEVEMENT NOT 
HIGHLY COMPARABLE TO THE HIGH SCHOOL ACHIEVEMENT SCALES 


TAS Men Women 
Criterion 
Predictors Beta r Predictors Beta r 
Social participation (Col.) Leadership achieve- .2325 | .28 | Leadership achieve- .2482 | 33 
ment (H8) ment (HS) 
cM achievement. 1365 | .31 GER achievement, 1984 | .38 
m [^ ndi +1717 | 32 RD achievement .1015 | .40 
[7.1871 | .34 (HS) 
Social service achievement Drama achievement -1048 | .13 | Leadership achieve- .1697 | .21 
Col.) (H8) ment (H8) on 
Music achievement .1048 | .17 | Drama achievement 1492 | - 
(H8) (HS) 
Business achievement (Col. | ACT English 1377 | .14 | ACT English — . 0800 dm 
Music achievement. —.0759 | .16 | Drama achievement .0812 | + 
(H8) (HS) 16 
ACT natural science — |[—. 0800 | - 
Humanistic-cultural achieve- | Literary achievement | .2888 | .33 | Lit hievement 2730 | -33 
ment (Col.) (HS) n EST, al 
Drama achievement +0958 | .34 | Art achievement (HS) 1198 » 
(Hs) Drama achievement M07 
(H8) 
Religious service (Col.) Music achievement. 1434 | .14 Dra achievement 1221 | -13 
E ( 14 
h —.1143 18 | ACT English —.0620 | . 
pi Music achievement. 10581 | +15 
(H8) 
Social Sons achievement Kid achievement 0915 | .10 | Literary achievement 1470 | .20 
^ HS; 
AST mathematics Ema 7a. | Drena sabisyonent «1138 | -28 
ACT social studies -1794 | .18 (Hi s 0922 | 25 
ACT natural science — |—.1059 | .20 | Art achievement (HS) | - 


Note.—Abbreviations in parentheses are as follows: Col. = College, HS = High School. For men, N = 503; for 
women, N = 592. 
TABLE 5 


MULTIPLE CORRELATIONS FOR COLLEGE SOPHOMORES FOR CRITERIA OF ACHIEVEMENT 
NOT HIGHLY COMPARABLE TO THE HIGH SCHOOL ACHIEVEMENT SCALES 


sca Men Women 
Criterion 
Predictors Predictors r 
Social participation (Col.) ij deve- ip achi 
ment (HS) ment GIS) iis i 
eras achievement, Litera achievement 85 
High school grades i 
aa paras High school grades .96 
AGE social studies 
Social service achievement Leadership achieve- ij ieve- ; 
(CoL) ment (HS) Lent (s) ev Si 
Des it DS achievement 25 
High school grades Scien ái 
E cobere ^ racy achievement .26 
(H8) 
Business achievement (Col.) iene achievement Drama achievement 0985 | .18 
HS. 
Ws achievement Ladera reat 0820 | .15 
ment 
h school grades it dies be 
Hig ne ACT social stu. „0817 | .16 
ment (H8) 
Humanistic-cultural achieve- ii ii 
aman rah ural achieve: Nod achievement. NM achievement +1899 | .29 
ACT social studies ACT social studies 1453 | .92 
ACT mathematics Drama achievement +1257 | .35 
Art achievement (HS) Gas) 
Art achievement (HS) .1086 | .36 
Religious service (Col.) ss achievement. Musio purus M94 | .18 
Leadership achieve- ACT social studies = B 
ment (HS) Drama achievement -0611 | .16 
ACT social studies (H8) 
Social soi à i i i i 
creme achievement Deos achievement ens achievement 1729 | .19 
pede achieve- Art achievement (H8) «1008 | .21 
ment (HS) 
High school grades 


Note.—Abbreviations in parentheses are as follows: Col. = College, HS = High School. For men, N = 1373; for 


women, N = 1419. 


iables would seem to be to construct a high school achievement scale 
corresponding closely to the college achievement scale. When 
predictors are available which are expected to have substantial 
validity on rational grounds and on the basis of previous research, as 
was the case with the highly comparable high school and college 
achievement scales in this study, they may not necessarily be improved 
(on cross-validation) by adding variables selected from a large number 
of predictors to maximize the multiple correlation. Indeed, because 
the multiple correlation may weight the single, dependable predictor 
inappropriately in the process of combining it with other variables, 
the validity of the weighted combination may actually be lower than 
the validity of the single variable alone in a new sample (Holland & 
Nichols, 1964). 


Discussion 


The present study demonstrates 
that it is possible to predict nonaca- 
demic accomplishment with moderate 
success, and it extends the similar 
findings of earlier research on stu- 
dents with high aptitude by showing 
that this is true for students with a 
broad range of academic potential. 
To illustrate, the median correlation 
between student nonacademic accom- 
plishment in high school and in col- 
lege in the same area of endeavor is 
about .39; the median correlation be- 
tween ACT scores and college grades 
is about .29; and the median corre- 
lation between grades in high school 
and in college is about .38. These 
values are not strictly comparable, 
of course, for at least two reasons: 
many students in the original sam- 
ple left college because of low 
grades; and we did not correlate 
individual ACT tests with grades in 
specific courses. Nevertheless, the 
results suggest that the predictive 
validities of the high school accomplishment 
scales⁴ are about as high 
for comparable criteria as the pre- 
dictive validities of the ACT tests. 
This study, therefore, is the cul- 
mination of our research to establish 
that some nonacademic accomplish- 
ments are independent of academic 
potential and accomplishment (Hol- 
land & Richards, 1965, in press), that 
nonacademic accomplishment can be 
assessed with moderate reliability 
(American College Testing Program, 
1965; Richards et al., in press), and 
that nonacademic potential can be 
predicted with moderate success (Hol- 
land & Nichols, 1964). The evidence 
also makes it unlikely that our re- 
sults can be attributed to nonlinear 
relationships between academic and 
nonacademic accomplishment (Holland 
& Richards, 1965), to defective 
scaling of nonacademic accomplishments 
(Holland & Nichols, 1964; 
Holland & Richards, 1965), to a 
narrow range of student talent (Hol- 
land & Richards, 1965, in press), to a 
student's distortion of his nonacademic accomplishment (Holland & 
Richards, in press; Richards et al., in press), or to the effects of 
some moderator variables (Holland & Richards, in press). These results 
also support many of the findings of investigators of creative and 
effective performance (Gough et al., 1963; MacKinnon, 1960; Price et 
al., 1964; Thorndike & Hagen, 1959; and others). The recent review by 
Hoyt (1966) provides still another important piece of evidence that 
classroom grades bear little or no relationship to measures of adult 
accomplishment. 

⁴ For the following six scales: Science, Art, Music, Literary, Drama, 
and Leadership. 

Because our criteria of nonacademic accomplishment are only a sample 
of such accomplishment, it is possible that academic potential and 
accomplishment have substantial positive correlations with some 
nonacademic accomplishments. The negligible relationships observed so 
far, however, make this possibility unlikely. While only an exhaustive 
examination of nonacademic accomplishments could negate this 
possibility, some relevant evidence is provided by the six new 
criteria of nonacademic accomplishment⁵ developed for this study. The 
negligible relationships between measures of academic potential and 
performance and these new criteria of nonacademic accomplishment 
reinforce earlier findings and lessen the possibility of finding some 
substantial positive correlations. 

As always, the present research leaves a number of closely related 
questions unanswered. It is not yet known whether nonclassroom 
accomplishments in high school and college are good predictors of 
similar accomplishment in adult life. Little is known about the 
college experiences that facilitate and inhibit the expression of 
talent in college after a record of talented performance is made in 
high school. The apparent contradictions between the findings of 
Terman and Oden (1959) and the findings of more recent investigations, 
such as the present study, need to be resolved. Similarly, the 
relationship of such work as Thurstone's (1938) primary mental 
abilities and Vernon's (1950) hierarchy of abilities to nonacademic 
accomplishment requires explication. A theory of human accomplishment 
encompassing our notions of intelligence, aptitude, nonacademic 
accomplishment, and originality would help us find answers to these 
questions. 

⁵ These criteria are: Social Participation, Social Service 
Achievement, Business Achievement, Humanistic-Cultural Achievement, 
Religious Service, and Social Science Achievement. 

Some of the practical applications 
of the findings seem clear. Measures 
of academic potential are the chief 
methods used to determine admission 
to college (Committee on School and 
College Relations, 1964). So long as 
one is interested only in finding stu- 
dents who will do well in the class- 
room in college, this emphasis is ap- 
propriate. But the emphasis in col- 
leges and universities on academic 
potential, because it concentrates on 
only one of several independent di- 
mensions of talent, has led to neglect 
of other equally important talents. 
Certainly, in the interest of social 
and human values, one should also be 
interested in finding students who 
will do outstanding things outside 
the classroom and in later life. 

We should, therefore, continue to develop and improve measures of many 
kinds of achievement and of originality. Further, we should consider 
such measures important in their own right, and not weak supplementary 
measures to remedy the slight defects of conventional aptitude and 
achievement tests. At the same time, we should not make the same 
mistake that the proponents of aptitude and intelligence have made in 
the past; that is, to rely on only one kind of measure and to exclude 
others. The results support some of the items used to obtain 
information about nonclassroom accomplishment in typical application 
blanks for admission to college, scholarships, and fellowships, but 
they also suggest the potential usefulness of a more reliable and 
valid record of each student's past achievement and involvement. 

The implications of this study, how- 
ever, extend beyond a need for a 
more systematic and comprehensive 
assessment of student accomplishment 
outside the classroom for purposes of 
admission or selection. At the very 
least, the findings imply a need to 
examine college grading practices, 
since college education should be 
largely a preparation for participa- 
tion in important areas of human en- 
deavor. Because college grades best 
predict graduate grades, current grad- 
ing practices imply that a college 
education is mainly preparation for 
more education in graduate school. 
The criteria of nonacademic accomplishment, 
in combination with college grades, provide a brief set of 
socially relevant measures which 
could serve as more comprehensive 
criteria of college success. Using these 
scales as guides, similar scales can be 
developed to increase our ability to 
assess student attainment of the 
broader goals of a college education. 
Moreover, once the simple principles 
of constructing such scales are 
grasped, it should be easy to develop 
scales to satisfy a particular college's 
unique needs. 

Further, the results imply a need 
for a broader, or different, definition 
of both the nature of human talent 
and the nature of higher education. 
There are many kinds of human ac- 
complishment, and each kind is likely 
to benefit from some type of higher 
education, although not necessarily a 
highly academic type. In other words, 
our results imply a need for a wide 
variety of colleges, many, if not most 
of them, relatively unselective except 
on dimensions clearly relevant to 
their particular emphasis. Measures 
of academic and nonacademic ac- 
complishment would then be used in 
helping students find an appropriate 
college, rather than being used in 
selecting students for a single college. 
As one critic of education said, a 
society (or a system of higher edu- 
cation) is “in a desperate way when 
its music makes little difference 
[Goodman, 1966].” Despite contrary 
protestations, most institutions of 
higher education rely heavily on aca- 
demic aptitude and grades in select- 
ing and evaluating students. Music 
and other important human accom- 
plishments make little difference. 

LEARNING VERBAL AND SYMBOLIC STATEMENTS OF
MATHEMATICAL RULES


JOSEPH M. SCANDURA 
University of Pennsylvania 


24 college Ss learned 4 rule statements and were then tested on 2 
problems to see if they could apply the rules involved. A 2 × 2 factorial
design, with repeated measures, was used. 1 factor was the
form in which the rules were stated, either succinctly worded Eng- 
lish or initially unfamiliar mathematical symbolism. The other fac- 
tor involved the presence or absence of symbol pretraining on the 
meaning of the constituent symbols. The results were: (a) sym- 
bolic statements were applied successfully if and only if the Ss were 
taught the symbol meanings (and the underlying grammar); (b) 
symbolic statements were learned more rapidly with or without sym- 
bol pretraining; (c) English statements were applied equally as 
well as those symbolic statements which followed symbol pretrain- 
ing; (d) rate of learning the English statements was unaffected by 
symbol pretraining; (e) success on 1 application problem implied success
on the other.


In recent years, there has been increasing
concern with the proposition
that the effectiveness of a given mode 
of presentation depends on learner 
ability. This supposition was origi- 
nally interpreted to mean that apti- 
tude profiles could be used to predict 
which mode of instruction would be 
most effective in promoting learner 
achievement. Thus, people who are 
high in spatial but low in verbal and 
symbolic ability, for example, presumably
would do best when concepts
are presented in spatial form. Similar 
interactions would be expected with 
different profiles. 

Gagné (1960) was one of the first 
to propose this sort of interaction hy- 
pothesis with respect to mathematics. 
Although there have been some studies 
since then which tend to support this 
point of view (e.g., Guilford, Hoepf- 
ner, & Peterson, 1965; Osburn & Mil- 
ton, 1963), most of the studies indi- 
cate that aptitude profiles are not 
nearly as strong predictors of learning
as are immediately requisite abilities.
In dealing with hierarchically ordered
tasks, for example, Gagné and his 
collaborators (Gagné, 1962; Gagné, 
Mayor, Garstens, & Paradise, 1962; 
Gagné & Paradise, 1961) found that 
the relationship between successful 
criterion performance and ability, as 
measured by aptitude tests, became 
progressively weaker as learning 
moved towards the hierarchical apex. 
Conversely, the relationship between 
criterion performance and acquired 
subordinate knowledge, as determined 
during the learning sequence, became 
progressively stronger. Largely as a
result of these studies, the general 
validity of the task-analysis proce- 
dure, introduced in education by 
Gagné and his co-workers, has become 
so well accepted that the basic pro- 
cedure has been extended by others— 
for example, to deal with process as 
well as content objectives (Kersh, 
1967). 

Unfortunately, however, there has 
been no attempt to vary the form in 
which new material (whether relat- 
ing to content or process) is presented. 
The important question remains as to
whether aptitude measures, while 
perhaps not as critical to learning as 
immediately requisite abilities, may 
still be the best predictors of which 
"form of content" should be used. 

In a recent study, Scandura (1966a)
found that the repeated reintroduction 
of passages of technical material did 
not diminish the initial advantage 
gained by those subjects (Ss) who had 
previously been required to demon- 
strate a high proficiency with the 
terminology used. This result ob- 
tained even though all of the Ss were 
exposed, in varying but parallel de- 
grees, to definitions of and practice 
with the prerequisite terms before the 
technical material was initially intro- 
duced. In discussing the results of 
their study on rule generality, Scan- 
dura, Woodward, and Lee (1967) also 
suggested that the ability to interpret 
a given rule statement may depend 
critically on the terms used to compose
the statement. It would appear
that specific interpretive abilities may 
be more fundamental than any gen- 
eral aptitude measure, no matter in 
what form information is presented. 
More particularly, the interpretation 
of a statement may depend on certain 
requisite abilities which can be logically
determined directly from the
to-be-presented material itself in a 
manner analogous to that used in task 
analysis (Gagné, 1962; Kersh, 1967) 
—that is, by asking the question,
“What does S need to be able to do in
order to interpret . . .?”

It was with this general orientation
in mind that the present study was 
designed to help clarify the role sym- 
bolism plays in learning mathematical 
rules. (See Scandura, 1966c, 1967;
Scandura et al., 1967, for a precise
definition of a rule.) Both the symbols 
actually used to construct rule statements
and the ability of an S to interpret
symbols were varied. It seems
almost axiomatic that the ability to 
interpret a rule statement depends 
critically on the ability to interpret 
the symbols of which it is composed 
(Scandura, 1966a), be they mathe- 
matical symbols or elements of the 
native language (e.g., English).
Whether symbol meanings are a 
sufficient condition for rule interpret- 
ability, in the deterministic sense, 
however, is not at all clear. Further- 
more, in mathematics learning the use 
of mathematical symbolism is fre- 
quently, if not always, preferred to 
ordinary English. The reason why, 
however, is never made explicit. 

The purpose of this study was to 
determine whether: (a) rules are more 
easily memorized when stated in 
mathematical symbolism or when 
stated verbally (i.e., in English), and 
(b) the ability to use constituent sym- 
bols correctly, assuming mastery of 
the underlying grammar, is a neces- 
sary and/or sufficient condition for 
applying a learned (i.e., memorized) 
rule statement. 

There are, of course, innumerable 
ways in which a given rule may be 
stated and it would be a tedious task 
indeed to make all possible experi- 
mental comparisons. Unless general 
relationships can be found without 
making all (or even many) such com- 
parisons, research on the question of 
whether rules are more easily memo- 
rized in symbolic or verbal form would 
be as fruitless and as hopelessly com- 
plex as traditional methods research 
in which two or more ill-defined in- 
structional procedures are compared. 

Hopefully, the present study does 
not fit that classification. An inde- 
pendent variable, called description 
level, has been defined (Scandura, 
1966c) which may be directly related 
to the rate of learning statements of 
a given rule. Statement A is said to be
of a higher description level than B
with respect to some reference symbolism
(usually the native language)
if the translation of A into the reference
symbolism requires all those reference
symbols needed to translate B
plus some additional symbols of its
own. Statements generally can be only
partially ordered according to this
definition. That is, some statements
may not be directly comparable as to
description level. Nonetheless, if a
positive relationship is found between 
description level and rate of learning, 
as hypothesized in this study, it should 
be possible, by interpolation and/or 
extrapolation, to make confident pre- 
dictions as to the relative rates of 
learning new rule statements which 
take on unstudied values of description
level. In the present study, two
values of the description level variable
were compared: statements in the native
language (i.e., English) and
statements of identical rules composed
of mathematical symbols. 
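The ordering just defined can be sketched with sets. If each statement is represented by the set of reference symbols needed to translate it, then A is of higher description level than B exactly when B's set is a proper subset of A's, and two statements whose sets overlap only partly are not comparable. The function name and the example symbol sets below are hypothetical illustrations, not materials from the study.

```python
def compare_description_level(symbols_a, symbols_b):
    """Compare statement A with statement B by the reference symbols
    their translations require.

    Returns "higher", "lower", "same", or "not comparable" (A relative to B).
    """
    a, b = frozenset(symbols_a), frozenset(symbols_b)
    if a == b:
        return "same"
    if a > b:  # A needs every symbol B needs, plus some of its own
        return "higher"
    if a < b:
        return "lower"
    return "not comparable"  # the ordering is only partial


# Hypothetical reference-symbol sets, for illustration only.
statement_a = {"add", "integers", "from", "thru", "multiply"}
statement_b = {"add", "integers", "from", "thru"}
statement_c = {"divide", "greatest", "integer"}

print(compare_description_level(statement_a, statement_b))
print(compare_description_level(statement_b, statement_c))
```

Representing statements as sets makes the partial character of the ordering explicit: the comparison can fail to rank two statements at all, which is the case the text describes as "not directly comparable."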

Data regarding the second purpose 
of the study were obtained after all of 
the rule statements were well learned.
In this regard, a secondary result of
research by Scandura and his collaborators
(Scandura, 1966b, 1966c,
1967a; Scandura et al., 1967) suggests
that, under certain conditions, any
one instance of a rule would provide
an adequate test of interpretability.
It was found that if the appropriate
response could be given to one stimulus
within the scope of a rule, then
the appropriate response to any other
within-scope stimulus could also be
given. In short, no additional information
was gained by using more than
one test instance. Even so, two test
instances were used in the present 
study, thereby making it possible to 
gain further information on the im- 
plied consistency hypothesis. 




METHOD 
Design and Subjects 


A 2 × 2 factorial design, with repeated
measures, was used so that each of the 24
Ss effectively served as his own control. One
factor was the form in which the rules were
stated, either succinctly worded English or
mathematical symbolism. The other factor
involved the presence or absence of symbol
pretraining on the meaning of the constituent
mathematical symbols. The four
rules (eight statements) used were counterbalanced
over the four treatment combinations
so that each of the 24 (4!) ways of
assigning 4 rules to 4 treatments was used 
once—one permutation per S. All other 
factors were randomized, including presenta- 
tion order. 
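The counterbalancing scheme can be sketched by enumerating the permutations of the four rules over the four treatment combinations. The rule names follow Table 1; the treatment labels and the function name are assumptions made for illustration.

```python
from itertools import permutations

RULES = ("Greek", "Integer", "Vector", "Partial")

# The four cells of the 2 x 2 design: (statement form, symbol pretraining).
TREATMENTS = (
    ("symbolic", "pretrained"),
    ("symbolic", "not pretrained"),
    ("verbal", "pretrained"),
    ("verbal", "not pretrained"),
)


def counterbalanced_assignments():
    """Return one treatment-to-rule assignment per permutation of the rules."""
    return [dict(zip(TREATMENTS, order)) for order in permutations(RULES)]


assignments = counterbalanced_assignments()
print(len(assignments))  # one assignment per S
```

Because every permutation is used exactly once, each rule falls in each treatment cell for the same number of Ss (six), so rule difficulty cannot be confounded with treatment.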

The 22 female and 2 male Ss were 
Florida State University elementary educa- 
tion majors taking a methods course in 
mathematics. Participation was not required 
but was encouraged. 


Materials and Procedures 


The Ss reported for the experiment one 
at a time. The S was told, 


This experiment is designed to determine 
how people learn rules for solving mathe- 
matical problems. I am going to present 
some rules for you to learn. After you 
learn these rules, I will give you some 
mathematical problems to see if you can 
apply these rules. 


Before learning the rules, S was given 
some common pretraining to be sure that 
he understood the way certain terms were 
to be used. In particular, he was told: (a) 
to perform operations within the innermost 
parentheses first before performing addi- 
tional operations, (b) what the consecutive 
integers from 1 through k were, (c) that

$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$

was a vector with $x_1$ in the first position
(row) and $x_2$ in the second position, (d) that
$ax^m y^n$ is an algebraic expression, and (e)
that the greatest integer in a number is the
largest integer less than or equal to that
number. In each case, S was presented with
one illustration and was required to complete,
in a correct manner, four practice


exercises based directly on the information 
given. None of this information was deemed 
sufficient for understanding any of the symbolic
rule statements.

TABLE 1
RULE STATEMENTS

Greek
Symbolic form: $\sum_{i=1}^{t} i \left( \sum_{i=1}^{r} i + \sum_{i=1}^{s} i \right)$
Verbal form:
1. Add the consecutive integers from 1 thru r.
2. Add the consecutive integers from 1 thru s.
3. Add the two sums obtained in Steps 1 and 2.
4. Add the consecutive integers from 1 thru t.
5. Multiply the sum resulting from Step 3 by the sum resulting from Step 4.

Integer
Symbolic form: $[[x] \div [y]]$
Verbal form:
1. Take the greatest integer in x.
2. Take the greatest integer in y.
3. Divide the result of Step 1 by the result of Step 2.
4. Take the greatest integer in the quotient in Step 3.

Vector
Symbolic form: $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \cdot \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \cdot \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}$
Verbal form:
1. Multiply the first number in $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ by the first number in $\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$.
2. Multiply the second number in $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ by the second number in $\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$.
3. Add the two products obtained in Steps 1 and 2.
4. Multiply the first number in $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ by the first number in $\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}$.
5. Multiply the second number in $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ by the second number in $\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}$.
6. Add the two products obtained in Steps 4 and 5.
7. Add the two products obtained in Steps 3 and 6.

Partial
Symbolic form: $D_y[D_x(ax^m y^n)]$
Verbal form:
1. Multiply the given expression $ax^m y^n$ by the exponent of x.
2. Reduce the exponent of x in the algebraic expression resulting from Step 1 by 1.
3. Multiply the algebraic expression resulting from Step 2 by the exponent of y.
4. Reduce the exponent of y in the algebraic expression resulting from Step 3 by 1.

Next, S was given specific information
about the mathematical symbols used to
compose two of the four rule statements
(see Table 1) later introduced. He was required
to learn this information before going
on. In each case, the presumably unfamiliar
symbols used to compose the rules
were defined, an example was given, and
four tasks, similar to the example, were
presented for S to practice what he had been
told. The definition corresponding to the
Greek rule was stated:
To find the number equal to

$\sum_{i=1}^{k} i$

add the consecutive integers from 1 thru
k.

The illustration was

$\sum_{i=1}^{6} i = 1 + 2 + 3 + 4 + 5 + 6 = 21$


and the practice problems were identical in 
form. The other definitions were stated: 

Integer—the brackets about the numeral 
A, [A], mean take the largest integer less 
than or equal to A, e.g. [3.57] = 3; 


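Vector—if

$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$

are vectors, then to get the number equal to

$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \cdot \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$:

(a) multiply the first number in $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
by the first number in $\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$,
(b) multiply the second number in $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
by the second number in $\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$,
(c) add the two products obtained in Steps
a and b, e.g.,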


15] - || "x54 - 18 + 20 — as; 


Partial—to find $D_x(ax^m y^n)$

(a) multiply $ax^m y^n$ by m,
(b) reduce the exponent of x in the algebraic
expression resulting from Step a by 1,
for example, to find the algebraic expres- 
sion equal to D. (52*y*); Step a gives 
202*y*, Step b gives 207*y* = D.(5z*y). 
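
As a concrete reading of the four rules, the verbal steps in Table 1 can be written as short functions. This is a sketch under stated assumptions: the function names are hypothetical, a monomial $ax^m y^n$ is represented as the triple (a, m, n), the greatest integer in y is assumed to be nonzero, and the test values are invented for illustration rather than taken from the study's problems.

```python
from math import floor


def sum_to(k):
    """Add the consecutive integers from 1 thru k."""
    return sum(range(1, k + 1))


def greek_rule(r, s, t):
    """Steps 1-5: (sum to r + sum to s) times (sum to t)."""
    return (sum_to(r) + sum_to(s)) * sum_to(t)


def integer_rule(x, y):
    """Greatest integer in the quotient of the greatest integers in x and y.

    Assumes floor(y) is not zero.
    """
    return floor(floor(x) / floor(y))


def dot(u, v):
    """First number times first number, plus second times second."""
    return u[0] * v[0] + u[1] * v[1]


def vector_rule(x, y, z):
    """Steps 1-7: the product of x with y plus the product of x with z."""
    return dot(x, y) + dot(x, z)


def partial_rule(a, m, n):
    """Apply the x steps, then the y steps, to a * x**m * y**n.

    Returns the resulting monomial as (coefficient, x exponent, y exponent).
    """
    a, m = a * m, m - 1  # Steps 1 and 2
    a, n = a * n, n - 1  # Steps 3 and 4
    return (a, m, n)


print(sum_to(6))  # the illustration in the text: 21
print(greek_rule(2, 3, 4))
print(integer_rule(7.9, 2.5))
print(vector_rule((3, 5), (6, 4), (1, 2)))
print(partial_rule(5, 4, 3))
```
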
Just before the rules were presented, S 
was told, 


The name of each rule, an introductory 
statement, and the corresponding rule
have been typed on a card. The name 
and introductory statement have been 
typed in black and the rule you are to 
learn in red. Your job will be to memorize 
that part typed in red so that you can 
write the rule correctly whenever you see 
its name. 

You will have a given period of time to 
study each rule statement that I show 
you. After you have studied each rule, I 
will present each name, in turn. Your job 
will be to learn all four rules so that you 
can write the rule statement correspond- 
ing to each of the four names shown. 
You will have all the time you need but 
please work rapidly. If you haven't 
learned the rule, write as much of it as 
you can and we will then go on. 


To be sure that S understood the procedure, 
the experimenter presented four distinct 
sample rules in the manner indicated. Then,
S was told that he would have to write each 
rule several times without a mistake and 
that more time would be allowed on succes- 
sive trials. The S was also reminded that he 
would be required to apply what he had 
learned. 

The order of presentation was random- 
ized on each trial and S was told whether 
he was right or wrong after each attempt to 
reproduce a rule statement. The test phase 
of each trial took place after all four rules 
were presented. On Trial 1, 5 seconds were 
allowed for the study of each rule; on Trial
2, 10 seconds; on Trial 3, 15 seconds; and
so on. To minimize the time required for 
writing while insuring that each rule state- 
ment was well learned, those rules which S 
successfully reproduced twice in a row were 
not reintroduced until all of the others had 
been learned to the same criterion. As a 
final review, S went through all four rules 
until he made no mistakes. In each case, S 
did this on the first (review) trial. 

After the learning criterion had been met, 
S was required to apply each rule to two 
stimulus instances. He was told, 


Each problem will appear together with 
the name of one of the rules you have 
learned. You are to simply apply this 
rule to the problem shown. That is, write 
the correct answer....You may refer to 
any of the information you have already 
seen any time you need it. 


In particular, the four rules were in full 
view of S throughout the interpretation-test 
period. The order of testing on the eight
problems was random except that each rule 
was tested once before any other was tested 
a second time. All of the testing was self- 
paced. Finally, S was asked how he remem- 
bered each rule, whether he found any short- 
cuts, and to make any other comments he 
cared to. 

All of the materials were reproduced by 
ditto and S wrote all of his answers—rules 
during the learning and solutions during the 
interpretation test—in the spaces provided. 

In summary, separate measures of learn- 
ing rate and interpretability were obtained. 
Learning rate was determined by presenting
each rule statement for study and testing
on each trial to see if S could correctly
reproduce it in written form. Number
of trials to a criterion of two perfect reproductions
in a row was the dependent measure.
Interpretability was measured by having
S apply each of the rules taught to two
stimulus instances. The interpretability testing
took place immediately after all of the
statements had been learned. 
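
The learning-rate measure and the study-time schedule described above can be sketched as follows. The function names and the example record of outcomes are hypothetical; the 5-seconds-per-trial increment extends the stated pattern (5, 10, 15 seconds, "and so on").

```python
def study_seconds(trial):
    """Study time allowed per rule on a given trial: 5, 10, 15, ... seconds."""
    return 5 * trial


def trials_to_criterion(outcomes, run_length=2):
    """Return the trial on which the criterion is first met.

    `outcomes` holds one boolean per trial (True = perfect reproduction).
    The criterion is `run_length` perfect reproductions in a row.
    Returns None if the criterion is never reached.
    """
    streak = 0
    for trial, correct in enumerate(outcomes, start=1):
        streak = streak + 1 if correct else 0
        if streak == run_length:
            return trial
    return None


# Hypothetical record for one rule: wrong, right, wrong, right, right.
record = [False, True, False, True, True]
print(trials_to_criterion(record))
print(study_seconds(3))
```
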


RESULTS 


The major results were straightforward
and are summarized in Table 2.

Those rules which were stated in
symbolic form were applied successfully
if and only if S had been taught
how to apply the constituent symbols
—assuming mastery of the underlying
grammar (i.e., use of parentheses).
There were only four of 24 (one per
S) consistent exceptions, where neither
problem was solved, to the sufficiency
part of this generalization and none
as regards necessity. The four exceptions
were spread out over three of
the four rules. Two other Ss each failed
to give the correct response to one of
the two application problems.²


² To clarify the underlying logic of this
result, let m and ~m represent knowledge
and lack of knowledge of the symbol meanings,
and a and ~a, ability and lack of
ability to apply the corresponding rule, respectively.
The (near deterministic) sufficiency
result may be represented m → a
(read m implies a) while the other demonstrated
result may be represented ~m → ~a,
which is equivalent to the contrapositive
a → m (which is the necessary condition).

Also notice that, while each S served as
his own control, the idiographic comparisons
above and throughout the results section
were necessarily averaged over different
rules.

TABLE 2
LEARNING RATE AND APPLICABILITY OF RULE STATEMENTS

                                     Number of trials to    Number of correct
                                     learning criterion     applications
Condition                            M        SD            M        SD
Symbol form-symbols meaningful       4.88     .63           1.58     .76
Symbol form-symbols nonmeaningful    4.92     1.10          0.00     .00
Verbal form-symbols meaningful       6.75     1.23          1.83     .57
Verbal form-symbols nonmeaningful    7.12     2.24          1.75     .66

The symbolic rule statements were
learned more rapidly than the verbal
statements [F = 64.21, df = 1/23, p <
.001].³ More important, the symbolic
statements were easier to learn
whether or not the symbols were meaningful
(F for the interaction was < 1).
There were a total of only eight of 48
(two per S) exceptions to the latter
two generalizations. In four of the
eight situations, the symbolic and
verbal statements were learned at the
same rate.

³ This result was consistent and reliable
for all but the Greek rule [t(11) > 3.0, p <
.01]; in that case the difference was in the
expected direction but was not significant
(t < 1). The reason for this exception was
not immediately clear but it may have been
due to the relative complexity of the primary
symbol, $\sum_{i=1}^{k}$, which involved
both sub- and superscripts.

Although they took longer to learn,
the verbal statements were applied
equally as well as (in fact, slightly
better than) those symbolic statements
in which use of the constituent
symbols had previously been mastered.
Furthermore, these verbal statements 
were learned at approximately the 
same rate and the underlying rules
were applied equally well whether or
not training in the use of the corresponding
mathematical symbols was
given. Only two of the 24 Ss were unable
to apply the verbally stated
rules when the symbols were meaningful
and only three were unable to
do so when they were not.

Strong support was again found 
for the consistency hypothesis. In only 
two cases out of 96 (four per S) was 
one of the two application problems 
solved and not the other. 
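
The application means in Table 2 can be checked against the counts given in the text. The sketch below assumes that an S described as "unable to apply" a rule solved neither of its two problems, and that every S not named as an exception solved both; the function name is hypothetical.

```python
def mean_correct(n_both, n_one, n_neither):
    """Mean number of correct applications (0, 1, or 2 per S)."""
    n = n_both + n_one + n_neither
    return (2 * n_both + n_one) / n


# Symbolic form, symbols meaningful: 4 Ss solved neither, 2 solved one.
symbolic_meaningful = mean_correct(n_both=18, n_one=2, n_neither=4)
# Verbal form: 2 Ss (symbols meaningful) and 3 Ss (not) solved neither.
verbal_meaningful = mean_correct(n_both=22, n_one=0, n_neither=2)
verbal_nonmeaningful = mean_correct(n_both=21, n_one=0, n_neither=3)

print(round(symbolic_meaningful, 2))
print(round(verbal_meaningful, 2))
print(round(verbal_nonmeaningful, 2))

# Share of the 48 symbolic-form problems solved after symbol pretraining.
print(round((2 * 18 + 2) / 48, 2))
```

Under these assumptions the counts reproduce the tabled means of 1.58, 1.83, and 1.75, and the two split cases are the only departures from the consistency hypothesis among the 96 rule-by-S pairs.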


DISCUSSION


At first glance, the effects of symbol
pretraining do not appear to be particularly
interesting or enlightening.
Without an assigned referent, the con- 
stituent mathematical symbols were 
basically no different from nonsense 
syllables and could hardly be ex- 
pected to be useful in applying a rule. 

On the other hand, the near deterministic
sufficiency data have important
implications. Mastery of the
constituent mathematical symbols—
also assuming mastery of the underlying
grammar (i.e., use of parentheses)
and an appropriate experimental
context—accounted for about
80% of the experimental outcomes on
the interpretability tests. Much of the
remaining 20% can probably be attributed
to mistakes in computation,
carelessness, and other momentary
lapses.

What this means, of course, is that 
general mathematical aptitude and 
achievement measures can be expected 
to account for the ability to interpret 
rules, stated with mathematical sym- 
bols, only to the extent that they 
covary with the sort of interpretive 
prerequisites identified here. Ability
× Treatment interactions, where they
exist and at least with respect to in- 
terpreting simple mathematical rules, 
can be attributed to very specific pre- 
requisites. On the other hand, while 
viewing the ability-treatment ques- 
tion in this manner could have a pro- 
found effect on how the problem is 
attacked theoretically, it does not 
preclude the possibility that a demon- 
strated interaction, between some gen- 
eral aptitude or achievement measure 
and some mode of instruction, might 
have important practical implications. 
The very fact that most existing 
standardized tests get at a wide va- 
riety of specific skills and abilities 
makes it quite possible that the pre- 
requisites for a given specific inter- 
pretive task will be sampled. None- 
theless, the results of this research 
suggest that better-defined tests may
be constructed by identifying the nature
of the information to be interpreted,
determining and classifying the
kinds of interpretive prerequisites required,
and sampling these prerequisites
across classifications to construct
the predictive tests. I am not suggesting
that this would be an easy task, but I
am suggesting that a rationale-based
approach to test construction may be a
better way to deal with the ability-treatment
question than an approach
based purely on pragmatics.⁴

⁴ Although this is not the place for a
lengthy discussion, similar arguments may be
made regarding test construction in general.
The research cited in the introduction suggests
that it might be better to work backwards
from the criterion task(s) in constructing
tests rather than by following the
more usual empirical procedure of constructing
tests and later determining their
validity.

Why the symbolic rule statements
were more easily learned than the
verbal statements is a difficult question
to answer definitively. While resulting
in shorter statements, the
mathematical symbols were not as
familiar to the Ss as the corresponding 
verbal symbols. Still, with the ma- 
terials used in this study it would ap- 
pear that relative length was the more 
critical factor. In Miller's (1956) 
classic paper it was shown that man 
can increase his finite and relatively 
constant capacity for processing in- 
formation by recoding information 
into a smaller number of "chunks." 
One likely manifestation of recoding 
is the mathematician’s tendency to 
substitute more compact mathematical 
symbols for longer verbal phrases as 
their use becomes more frequent. Ac- 
cording to this interpretation, learn- 
ing the longer verbal statements prob- 
ably required a substantially greater 
degree of recoding than did the others. 
The shorter symbolic statements, al- 
though composed of less familiar 
mathematical symbols, were effec- 
tively recoded to some degree already. 

Providing S with the meanings of the 
mathematical symbols before he 
learned the corresponding verbal 
statements, however, did not reliably 
increase learning rate. This suggests 
that Ss tended to use their own pre- 
ferred bases for recoding the verbal 
statements rather than the relatively 
unfamiliar bases (i.e., mathematical
symbols) provided.⁵

⁵ It may be just this sort of preference
which resulted in a conceptual organizer
having an unexpected negative effect on
Cos memory (Scandura & Roughead,

Although the verbal rule statements
took more time to learn than the symbolic
statements, once learned, they
were applied equally as well—in fact,
slightly better. This result provided
further support for an earlier finding
by Scandura and Behr (1966). Apparently,
the inherent redundancy of
English, although resulting in relatively
long rule statements, in no way
interfered with the application of the
rules. It is possible, of course, that the 
application of particularly cumber- 
some rule statements, even when per- 
fectly memorized, may require abil- 
ities beyond those had by many 
learners. Nonetheless, it may be well 
to remember that ordinary English, 
although less efficient, can be effec- 
tively used to teach precise mathe- 
matical ideas when appropriate 
mathematical symbolism is not im- 
mediately available to the learner. 

While this study clearly does not 
deal with all that is relevant, these 
results do make explicit at least cer- 
tain aspects of the role symbolism 
plays in learning mathematical rules. 
The use of mathematical symbols ap- 
parently makes learning (with “un- 
derstanding”) more efficient when the 
constituent symbols and underlying 
grammar have been previously mas- 
tered. 

An extrapolation of the obtained 
results provides a rational and ex- 
plicit basis for making one type of 
branching decision in instructional 
sequences—the sort of decision rule
on which the effectiveness of computer-assisted
instruction (CAI) may
well rest. Given the objective of learning
a particular rule and an expository
mode of instruction, one might proceed
as follows: (a) test to see if S
can make use of the constituent symbols,
(b) if so, present the rule in the
more efficient symbolic form, (c) if 
not, present the same statement in 
English. The circumstances under 
which this decision rule can be ap- 
plied, of course, are extremely limit- 
ing. Nonetheless, it may be a step in 
the right direction. Heretofore, although
student feedback has provided
the basis for all adaptive
instructional procedures, it has typically
been unclear as to what sort of
feedback to measure.
The present results suggest that
specific sorts of feedback are needed 
in order to make specific kinds of de- 
cisions. If this is true, it is quite pos- 
sible (I feel quite likely) that general 
feedback measures, such as number 
of errors, average latency, etc., will
play a less and less important role in 
CAI in the years ahead. Equally im- 
portant, the obtained consistency re- 
sults provide further reason to sus- 
pect that appropriate feedback may 
be determined in a more efficient man- 
ner than has been the case to date. 
Even highly efficient testing takes 
time, however, and, whereas testing 
is an integral part of the educational 
process, it must be used judiciously. 
There may be some optimal trade-off 
between the time devoted to testing 
and the time devoted to instruction. 
The role of basic research will be to 
help clarify these issues. 
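
The branching decision outlined in the discussion (test whether S can use the constituent symbols; if so, present the symbolic form; if not, present the same rule in English) can be sketched as a small function. The function names, the answer-key representation, and the mastery criterion (every symbol item correct) are assumptions made for illustration; the text does not specify how the symbol test would be scored.

```python
def can_use_symbols(responses, answer_key):
    """Assumed mastery criterion: every symbol item answered correctly."""
    return len(responses) == len(answer_key) and all(
        given == expected for given, expected in zip(responses, answer_key)
    )


def choose_statement_form(responses, answer_key):
    """Pick the form in which to present the rule statement.

    Symbolic form if S has mastered the constituent symbols,
    otherwise the same rule stated in English.
    """
    return "symbolic" if can_use_symbols(responses, answer_key) else "english"


# Hypothetical four-item symbol test (e.g., greatest-integer items).
key = [3, 7, 0, 12]
print(choose_statement_form([3, 7, 0, 12], key))
print(choose_statement_form([3, 8, 0, 12], key))
```

The point of the sketch is that the branch depends on one specific kind of feedback, performance on the constituent symbols, rather than on a general measure such as number of errors or average latency.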

Synonymity, antonymity, homonymity, and word association
“meaning” were studied independent of one another between training
and transfer in either the stimulus, response, or both positions
in paired-associate lists. In training the stimulus, response, or both,
words (for 3 groups of 20 university Ss) were varied in “meaning”
in relation to the transfer task in 6 ways: unrelated, synonyms,
antonyms, homonyms, associates, and identical. The experiment
was a 3 lists by 6 “meanings” design with between Ss for lists and
within Ss for meaning. Analyses of variance indicated that there were
reliable differences in transfer among lists and among meaning conditions.
There was greater transfer for homonyms and associates than
for synonyms and antonyms. The discussion focused on preference, re- 
sponse-chaining, and representational mediation hypotheses as well 
as similarity of meaning in transfer of learning. 


The fact that meaningful material 
is learned more readily than less or 
nonmeaningful material hardly needs 
documentation today. Meaningfulness 
also appears to be an important fac- 
tor in transfer of learning; however, 
in this respect the “picture” is less 
clear. This is true in spite of the large 
number of research studies on verbal 
transfer since Gibson (1940) pub- 
lished her studies and analyses in 
terms of generalization and differen- 
tiation. In 1949 Osgood proposed a 
theory with three laws which inte- 
grated the various experimental find- 
ings of transfer studies then available 
and predicted additional transfer phenomena.
The result was his well-known
transfer surface which holds that the 
degree of similarity of stimuli, re- 
sponses, or both, between the training 
and transfer tasks produces varying 
degrees of positive or negative trans- 
fer effect. Later Osgood (1952) indi- 
cated that it was the similarity in 
meaning (not just phonetic, ortho- 
graphic or physical similarity) that 
might be an important factor in mediating
positive or negative transfer effects.

Unfortunately, very few studies
have been done which have systematically
varied meaning (either synonymity,
antonymity, or word associations)
in either the stimulus or
response elements between paired-as- 
sociates (PA) lists in training and 
transfer. Early studies by Haagen 
(1943), Morgan and Underwood 
(1950), and Underwood (1951) show 
that transfer effects increase to the 
degree that words in the same posi- 
tion between the training and trans- 
fer lists are judged similar in mean- 
ing. However, later it was realized 
that in these studies similarity of 
meaning (synonymity) was con- 
founded with association values be- 
tween the words employed in the 
training and transfer tasks. These 
studies sampled words from Haagen’s 
(1949) lists and he reports that the 
degree of synonymity and association 
value correlated .90 for 80 judges 
rating 400 pairs of adjectives. Thus 
in the studies by Haagen and Under- 
wood transfer could have been a re- 
flection of either association values, 
synonymity, or both. Of course, since 
then Russell and Storms (1955), 
among others, have reported that as- 
sociation values of words are an im- 
portant factor in verbal transfer. 

In an effort to unsnarl the semantic 
from the associative factors, Bastian 
(1961) designed an experiment in 
which the same words were used as 
stimuli in the training and transfer 
tasks while the response elements 
were varied from training to transfer. 
On the other hand, Ryan (1960) did 
an experiment in which he varied 
stimulus elements in terms of semantic 
or associative factors between training 
and transfer. In both the Ryan and 
Bastian experiments one-third of the 
words were high in semantic meaning 
and low in association value, one- 
third of the words were high in asso- 
ciation value and low in semantic 
meaning, and one-third were control 
words which were very low in associa- 
tion value and judged to be very low 
in similarity of meaning. In both the 
Ryan and Bastian experiments, that 
is, whether stimulus or response ele- 
ments were varied from training to 
transfer, the results showed positive 
transfer on most of the measures for 
both associative and semantic word 
types as compared with the controls. 
However, the difference between the 
associative and semantic word types, 
though small, favored the associative. 
Nevertheless, it is difficult to con- 
clude in favor of the word-association 
hypothesis because in Bastian’s study 
10 of the 12 word associates and in 
Ryan’s study six of the nine word as- 
sociates in the transfer list were 
clearly antonyms of the corresponding 
stimulus or response element in the 
training list. Therefore, transfer of 
learning could have resulted from 
word associations, antonyms, or both. 

In an effort to manipulate systemat- 
ically both stimulus and response 
meaning relationships from training 
to transfer, Wimer (1964) employed 
adjectives originally used by Osgood 
(1946). Twenty-five different PA 
lists in training were constructed in 
which stimuli or responses were ad- 
jectives either identical, similar, un- 
related, opposed, or antonyms to ad- 
jectives in comparable stimulus or 
response positions in the transfer list. 
The design of the experiment was, 
thus, five stimulus conditions com- 
bined factorially with five response 
conditions. With the exception of the 
subjects (Ss) with both stimuli and 
responses identical in training and 
transfer, none of the other 23 groups 
showed a statistically reliable differ- 
ence from the control group (Ss with 
stimuli and responses unrelated from 
training to transfer). One difficulty 
that may have beclouded these re- 
sults is that stimulus and response 
elements in the opposed list were not 
opposite in meaning to the common 
transfer list but were opposite or 
really antonyms of the list of words 
deemed similar, for example, HARD- 
soft. In addition, there was a list of 
words that were antonyms of the 
common transfer words, for example, 
TENSE-relaxed. 

With the noted difficulties of the 
previous studies in mind, synonymity, 
antonymity, and associative meaning 
still need to be studied independent 
of one another, but in relation to cer- 
tain control conditions, and in compar- 
able positions in both stimulus and re- 
sponse elements between training and 
transfer in one experiment. The pres- 
ent study attempts to do just that. 


METHOD 
Part I 


Materials and procedure. In an effort to 
select words that were either synonyms, 
antonyms, or primary associates of one 
another in the training and transfer tasks, 
the difficulty was faced of either controlling 
for or manipulating homonymity as a vari- 
able. Because of its possible influence in 
mediating verbal transfer (especially when 
S responds aloud), homonymity was em- 
ployed as a variable. In order to develop the 
materials, a list of 223 pairs of homonyms 
was constructed. Out of this group, 105 pairs 
of homonyms were selected for which a 
synonym and an antonym could be readily 
found for at least one of the homonym 
words in a pair. This was done so that 
eventually it would be possible to check 
whether an antonym or synonym would also 
be a word associate. The 105 pairs of homo- 
nyms were then employed in a free-associa- 
tion test. One-half of the pairs of homo- 
nyms were put on one form of the test and 
the other half on another form. In order 
to control for possible sequence effects in 
the order of presenting word associates both 
sets of homonyms were arranged in two 
different sequences. 

Subjects. One of the words of each pair 
of homonyms, using both sequences, was 
administered to 114 upperclassmen in educa- 
tional psychology. Five days later the other 
word of the pair of homonyms, using both 
sequences, was administered to the same 114 
Ss. Approximately one-half of the Ss received 
one set of homonyms the first day and the 
other set the second day, and vice-versa. 
The frequency of word associations to the 
homonyms was tabulated on an IBM 7044 
computer. 


Part II 


Materials. Only a limited number of 
words for the learning lists could be selected 
whose more frequent associates were not a 
synonym, antonym, or homonym. Table 1 
presents the words employed in the training 
and transfer tasks with notation as to which 
of three groups of Ss received what PA lists 
of words in training and transfer. The train- 
ing and transfer lists consisted of 12 pairs 
of words. In the training lists either the 
stimulus, response, or both (for three groups 
of Ss) were varied in “meaning” in relation 
to the common transfer list in six ways. 
Two of the words were unrelated, two were 
synonyms, two were antonyms, two were 
homonyms, two were associates, and two 
were identical words between the training 
and transfer tasks. Even though each train- 
ing list was mixed with regard to “meaning,” 
the lists were unmixed or “like” lists accord- 
ing to Underwood (1964). The words em- 
ployed were limited to two syllables. The 
frequency of occurrence per 10 million words 
from Lorge's unpublished count of words 
from five magazines still being published 
today ranged from 119 for the two antonyms 
to 2077 for the two homonyms used in the 
response elements of the training task. This 
is a small difference in a count of 10 million 
words. 

TABLE 1 
WORDS EMPLOYED AND THEIR RELATIONSHIP IN “MEANING” FOR THE TRAINING 
TASKS AND COMMON TRANSFER TASK 

                      Stimuli               Responses 
Types of 
“meaning”          1          2          3          4 

Unrelated        IDLE       DAWN       sent       accept 
                 FAINT      CAST       morn       vice 
Synonyms         MAIL       POST       rows       tiers 
                 GRATE      GRIND      tail       rear 
Antonyms         BALL       CUBE       damn       bless 
                 PACKED     EMPTY      hale       sick 
Homonyms         STAYED     STAID      dear       deer 
                 GROAN      GROWN      marry      merry 
Associates       MAIN       STREET     eight      nine 
                 TOE        FOOT       vane       weather 
Identical        WEAK       WEAK       pause      pause 
                 PEACE      PEACE      soar       soar 

Note. The group that had stimuli varied learned paired lists 1-4 in 
training and 2-4 in transfer; the group that had responses varied 
learned paired lists 2-3 in training and 2-4 in transfer; the group 
that had stimuli and responses varied learned paired lists 1-3 in 
training and 2-4 in transfer. 


Procedure. Training and transfer lists 
were typed in lower-case letters in the 
center of 3 by 5 inch white index cards. 
Standard PA instructions were used prior 
to the training task. The Ss were run in- 
dividually by the anticipation method. The 
Ss were told to say each word out loud. 
Four sequences of the training and transfer 
materials were constructed: an initial expo- 
sure of the S-R pairs and three sequences 
for the learning tasks—to prevent serial 
response learning. S-R pairs in the initial ex- 
posure trial were presented at 2-second 
intervals. In the training and transfer trials 
stimulus words were presented at S's rate 
of responding but no word was presented for 
more than 5 seconds. The correct S-R pair 
was presented for 2 seconds. In training and 
transfer Ss performed to a criterion of two 
successive trials without error. (The number 
of trials for each condition of “meaning” 
was constant since a repeated-measures de- 
sign was employed.) Immediately after 
reaching criterion on the training task, S 
was told “Now I will show you another set 
of cards. The instructions are the same as 
for the last set.” 

Design and subjects. The experiment was 
a three lists (stimuli, responses, or both 
varied from training to transfer) by six 
types of “meaning” design, with lists as a 
between-Ss factor and meaning as a 
within-Ss factor. 
Sixty Ss were assigned alternately to either 
the stimulus, response, or stimulus and re- 
sponse conditions in training as they ap- 
peared at the laboratory. The Ss were 
volunteers from classes in educational psy- 
chology and had not participated in Part I 
of the study. 


TABLE 2 
MEAN NUMBER OF ERRORS IN TRAINING 

                              Lists 
Types of                                  Stimulus and 
“meaning”      Stimulus     Response      response 

Unrelated        5.90         6.85          6.30 
Synonym          6.00         4.80          4.75 
Antonym          4.05         5.45          4.45 
Homonym          5.05         4.95          5.05 
Associate        3.15         4.10          4.90 
Identical        5.25         5.00          5.60 
Total           29.40        31.15         31.05 


RESULTS 
Training 


The mean number of errors over all 
the trials in training is presented in 
Table 2. An analysis of variance for 
repeated measures indicates that there 
are no reliable differences in perform- 
ance among the three lists (F < 1.0; 
df = 2/57). However, there are re- 
liable differences among the types of 
meaning (F = 5.37; df = 5/285; p < 
.01), but the interaction of Lists × 
Meaning (F = 1.20; df = 10/285) is 
not significant. In order to locate where 
the differences among the meaning 
conditions existed, Scheffé’s confidence 
interval procedure was applied to the 
data for each of the three lists (groups 
of Ss) separately; using the within Ss 
mean square error term from a one- 
way analysis of variance for repeated 
measures with alpha = .01. It was 
found, only in the stimulus list, that 
the unrelated and synonym conditions 
had reliably more errors than the as- 
sociation condition. In the response 
and stimulus and response lists there 
were no differences among the means. 
Thus the words in the associate con- 
dition for the stimulus list only were 
easier to learn than those in the unre- 
lated and synonym conditions, even 
though a pilot study showed no differ- 
ences. 
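
The degrees of freedom quoted for the overall analysis follow directly from the design. The sketch below is illustrative only: it assumes 60 Ss divided among three lists, with six meaning conditions measured within Ss, and the function and key names are hypothetical.

```python
def mixed_design_df(n_subjects, n_groups, n_levels):
    """Degrees of freedom for a groups (between-Ss) by levels (within-Ss) design."""
    between = n_groups - 1                  # lists
    error_between = n_subjects - n_groups   # Ss within lists
    within = n_levels - 1                   # meaning conditions
    interaction = between * within          # Lists x Meaning
    error_within = error_between * within   # Meaning x Ss within lists
    return {
        "lists": (between, error_between),
        "meaning": (within, error_within),
        "lists_x_meaning": (interaction, error_within),
    }


# 60 Ss, 3 lists, 6 types of "meaning"
df = mixed_design_df(60, 3, 6)
print(df["lists"])            # (2, 57)
print(df["meaning"])          # (5, 285)
print(df["lists_x_meaning"])  # (10, 285)
```

These values match the df = 2/57, 5/285, and 10/285 reported in the text.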


Transfer 


In view of the significant difference 
among meaning conditions for the 
stimulus list in training, rank-order 
correlations were computed between 
over-all errors in training and trans- 
fer for each cell in the six meaning by 
three lists conditions to ascertain 
whether analyses of covariance should 
be employed. The correlations ranged 
from .40 to -.31 with a median value 
of .10. Since only two of the 18 corre- 
lations were significantly different 
from zero at the .05 level, it was de- 
cided that analyses of covariance were 
unwarranted. 
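
A rank-order (Spearman) correlation of the kind described above can be computed as the product-moment correlation of the ranks. The following is a minimal sketch, not the original computation: the error scores are invented for illustration, and tied scores are assumed to receive the average of the ranks they span.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks (undefined if either variable is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical training and transfer errors for the Ss in one cell.
training_errors = [6, 3, 8, 5, 4, 7]
transfer_errors = [1, 0, 2, 2, 0, 1]
rho = spearman_rho(training_errors, transfer_errors)
```

One such coefficient would be computed for each of the 18 cells of the six meaning by three lists arrangement.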

First trial. Table 3 presents the 
mean number of errors on the first 
trial in transfer. An analysis of vari- 
ance showed that the differences 
among the three lists were significant 
(F = 6.20; df = 2/57; p < .01). The 
total errors on the first list were what 
could be expected from transfer 
theory; where only stimuli are varied 
there are fewer errors than where only 
responses are varied, and where re- 
sponses are varied but are similar 
there are fewer errors than where both 
are varied. Also there was a reliable 
difference among the meaning condi- 
tions (F = 12.97; df = 5/285; p < 
.001) and the Lists × Meaning inter- 
action was significant (F = 3.33; df 
= 10/285; p < .001). 


TABLE 3 
MEAN NUMBER OF ERRORS ON THE FIRST TRIAL AND ALL TRIALS IN TRANSFER 

                     First trial                    All trials 
Types of                          Stimulus                       Stimulus 
“meaning”     Stim-     Res-      and          Stim-     Res-      and 
              ulus      ponse     response     ulus      ponse     response 

Unrelated      .95      1.55      1.40         1.55      2.55      2.15 
Synonym        .60       .90      1.20         1.00      1.45      1.85 
Antonym        .75      1.05      1.50         1.05      2.15      2.00 
Homonym        .35       .10       .50          .45       .35       .55 
Associate      .35       .20       .80          .40       .45       .70 
Identical      .35       .20       .05          .70       .45       .25 
Total         3.35      4.00      5.15         5.15      7.40      7.50 

In order to obtain an estimate of 
the degree of positive and negative 
transfer on the first trial each mean 
can be compared with the unrelated 
S + R mean in Table 3, which is the 
A-B, C-D control for learning-to- 
learn and “warm-up” effects. Without 
going into all the comparisons, it can 
be readily seen that when the words 
in training are homonyms, associates, 
or, of course, identical to their coun- 
terpart in the stimulus, response, or 
stimulus and response conditions on 
the transfer task, there is considerable 
positive transfer effect. 
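
The comparison with the A-B, C-D control can be made explicit by subtracting each cell mean from the control mean, so that a positive value indicates fewer errors than the control (positive transfer). This is only an illustrative sketch: the means are transcribed from the first-trial columns of Table 3 as they appear in print, the identical stimulus-and-response entry is read as .05, and the names used are hypothetical.

```python
# Unrelated condition, stimulus-and-response list, first trial (A-B, C-D control).
CONTROL = 1.40

first_trial_means = {
    "homonym":   {"stimulus": 0.35, "response": 0.10, "stimulus+response": 0.50},
    "associate": {"stimulus": 0.35, "response": 0.20, "stimulus+response": 0.80},
    "identical": {"stimulus": 0.35, "response": 0.20, "stimulus+response": 0.05},
}


def error_savings(means, control):
    """Control mean minus cell mean; positive values indicate positive transfer."""
    return {
        condition: {lst: round(control - m, 2) for lst, m in by_list.items()}
        for condition, by_list in means.items()
    }


savings = error_savings(first_trial_means, CONTROL)
for condition, by_list in savings.items():
    print(condition, by_list)
```

Every value produced for these three conditions is positive, which is the pattern the paragraph describes.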

A one-way analysis of variance with 
repeated measures was applied to the 
stimulus list, and the meaning con- 
dition was significant (F = 3.91; df 
= 5/95; p < .01). However, when 
Scheffé’s confidence-interval procedure 
was used, with the appropriate mean 
square error, none of the mean differ- 
ences were reliable. It appears that 
Scheffé’s test is a conservative one. 
Since Bastian and Ryan found that 
the mean errors were less for the as- 
sociate than the semantic conditions, 
a t test for correlated means with a 
one-tailed test was used. Indeed, the 
difference was significant (t = 2.60; 
df = 19; p < .05). 
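
A t test for correlated means is computed on the within-S difference scores. The sketch below shows the arithmetic under stated assumptions: the scores are invented, and with the 20 Ss of one list the test would have df = 19, for which the tabled one-tailed critical value at the .05 level is 1.729.

```python
import math


def paired_t(x, y):
    """t for correlated means: mean difference over its standard error; df = n - 1."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # unbiased variance
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Hypothetical first-trial errors for four Ss (synonym vs. associate items).
synonym_errors = [2, 3, 4, 5]
associate_errors = [1, 1, 2, 2]
t, df = paired_t(synonym_errors, associate_errors)
print(round(t, 3), df)  # 4.899 3
```

Because the direction of the difference was predicted from Bastian and Ryan, only the upper tail of the t distribution is consulted.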

For the response list an analysis of 
variance for repeated measures was, 
again, employed, and the meaning 
condition was significant (F = 27.80; 
df = 5/95; p < .001). Scheffé’s pro- 
cedure showed that the synonym, 
homonym, associate, and identical 
conditions were different from the un- 
related. In addition, the homonyms, 
associates, and identicals are different 
from the antonyms as well as the 
synonyms. 

For the stimulus + response list 
the analysis of variance for meaning 
was, again, significant (F = 17.32; df 
= 5/95; p < .001). Scheffé’s procedure 
indicated that the homonym, associ- 
ate and identical conditions were 
reliably different from both the un- 
related and antonym conditions. Fur- 
thermore, the identical was different 
from the synonym condition. 

All trials. Table 3 also presents the 
mean number of errors for all the 
trials in transfer. Employing the same 
analyses to these data as for the first 
trial, it was found that there were 
reliable differences for lists (F = 3.40; 
df = 2/57; p < .05). These results 
indicate that there were fewer errors 
on the stimulus than the other two 
lists. Also, the meaning conditions (F 
= 28.82; df = 5/285; p < .001) and 
the interaction of Lists × Meaning 
were significant (F = 2.12; df = 
10/285; p < .05). An indication of 
the amount and direction of specific 
transfer effects can be had by com- 
paring each mean with the A-B, C-D 
control for general transfer effects. In 
general, it can be seen that when the 
words in training are homonyms, as- 
sociates, or identical to the correspond- 
ing words in the stimulus, response, 
or stimulus and response conditions 
in the transfer task there is a great 
deal of positive transfer. 

For the stimulus condition a re- 
peated measures analysis of variance 
indicated that meaning was signifi- 
cant (F = 3.75; df = 5/95; p < .01). 
When Scheffé’s procedure was em- 
ployed, however, none of the mean 
differences was reliable. Again, since 
Bastian and Ryan found that the 
mean errors were larger for the se- 
mantic than the associates, a one- 
tailed t test for correlated groups was 
computed and found to be significant 
(t = 2.60; df = 19; p < .01). 

An analysis of variance with re- 
peated measures was applied to the 
response list, and there were reliable 
differences among the means (F = 
13.92; df = 5/95; p < .001). Scheffé’s 
procedure showed that the homonyms, 
associates, and identical words were 
reliably different from the unrelated 
and antonym words. 

For the stimulus + response list 
the analysis of variance showed that 
the meaning conditions were again 
significant (F = 14.39; df = 5/95; p 
< .001). Scheffé’s procedure indicated 
that the homonyms, associates, and 
identical words were reliably different 
from the unrelated and antonym 
words. In addition, the homonyms and 
identicals were different from the 
synonyms. 


DISCUSSION 


As far as “meaning” is concerned, 
the results of the experiment lead to 
the conclusion that in paired-associate 
learning when the training and 
transfer words are related as 
associates or homonyms there is con- 
siderably more transfer of learning 
than when the relationship is one of 
antonyms or synonyms. Furthermore, 
the associates and homonyms appear 
to have the same effect as practicing 
the identical words from training to 
transfer. This seems to be the case 
whether the stimulus, response, or 
both are varied. The fact that there 
was greater transfer for words with 
associative meaning than synonym 
(semantic) meaning where either 
stimuli or responses are varied con- 
firms the previous findings by Bastian 
(1961) and Ryan (1960). However, 
although Bastian and Ryan confounded 
word association with antonymity, the 
results of the present experiment indi- 
cate that they were correct in inferring 
that there was more transfer for words 
having associative connections than 
semantic meaning. This is borne out 
by the present study in which there 
were no reliable differences in transfer 
between the synonym and antonym 
conditions, but there were reliable dif- 
ferences between the antonym and 
associate conditions. 

Of course, the results obtained in 
this experiment may not occur when 
homogeneous or unmixed lists for 
meaning are employed, that is, where 
six separate groups learn six separate 
lists followed by a common transfer 
task. Underwood (1964) suggests that 
in mixed lists S may have a “prefer- 
ence” for starting with certain items 
and will learn these first, whereas a 
list homogeneous for the same kind 
of words may not be learned faster 
than a list homogeneous for other 
kinds of words. There was no “prefer- 
ence” for any of the types of “mean- 
ing” words in the three training tasks, 
with the exception of associates over 
the unrelated and synonym words in 
the stimulus-varied list. Therefore, 
any preference in the transfer task 
would have to be based upon meaning 
relationships between training and 
transfer. One interpretation is that 
with mixed lists differential transfer 
may be based upon preference, particu- 
larly preference for associates and 
homonyms. 

Perhaps the results of the present 
experiment also can be interpreted as 
favoring the associative, response 
chaining, mediation theory espoused 
by the Minnesota researchers (Jen- 
kins, 1963; Russell & Storms, 1955) 
who have employed free-word-asso- 
ciation procedures from which media- 
tion processes can be inferred in trans- 
fer. On the other hand, some may wish 
to interpret the results as not favoring 
the representational mediation theory 
of Osgood (1952), who has advocated 
the use of the semantic differential for 
the assessment of mediators. How- 
ever, the semantic differential is prob- 
ably not measuring semantic (syno- 
nym, antonym) or lexical meaning but 
connotative meaning. Therefore, since 
there was less transfer for the syno- 
nym and antonym than the word-as- 
sociation conditions in the present ex- 
periment, these results do not appear 
to be contradictory to Osgood's rep- 
resentational mediation theory. As a 
matter of fact, some psychologists 
(Bousfield, 1961; DeBurger & Dona- 
hoe, 1965; Noble, 1963) have indi- 
cated that associative meaning and 
connotative meaning, as measured on 
the semantic differential, are highly 
correlated, at least for some words. 
Osgood (1961) has presented evidence 
that S’s marks on the scales of the 
semantic differential were not made 
only on the basis of word-association 
tendencies. 

A curious finding in the present 
study is that there was more transfer 
for the homonym than the synonym 
condition. This result is contrary to 
those obtained by Razran (1939) and 
Riess (1946), who found more general- 
ization among synonyms than homo- 
nyms for adult Ss. Perhaps some of 
this discrepancy can be attributed to 
the methods employed. Razran and 
Riess used a classical, semantic-con- 
ditioning paradigm whereas the pres- 
ent study employed an instrumental 
learning paradigm. Further, the fact 
that Ss pronounced all words may 
have served to make the homonyms 
function somewhat like identical 
words. Also, different words were em- 
ployed. 

With regard to Osgood's (1949) 
transfer model and the question of 
similarity or, particularly, the op- 
posed relationship between training 
and transfer words, a logically and 
semantically confusing point should 
be clarified. Some psychologists (e.g., 
Wimer, 1964) have apparently thought 
that semantic similarity could be 
scaled from identical, to similar, to 
unrelated, to opposed, to antonyms. 
Words judged to be opposed or anto- 
nyms of other words must be se- 
mantically related, otherwise opposi- 
tion or antonymity could not be 
ascertained. Consequently, opposite or 
antonym words must fall on a se- 
mantic scale somewhere between being 
identical and unrelated rather than at 
points beyond that of unrelatedness. 
If they were semantically beyond the 
point of unrelatedness, negative trans- 
fer should occur. However, instead of 
showing negative or even no transfer, 
when antonyms were varied in either 
the stimulus, response, or both condi- 
tions in the present experiment there 
was some positive transfer when com- 
pared to the corresponding unrelated 
condition for all three groups of Ss. 
These results place antonyms on a 
scale between identical and unrelated 
words. 

The results obtained under the con- 
ditions of the present study should 
not be inferred to suggest that edu- 
cators should stress the teaching of 
homonym words or mere associations 
to words as opposed to the semantic 
aspects of words. On the contrary, the 
semantic aspects of words are proba- 
bly far more important, educationally 
speaking. Synonyms, antonyms, and 
other semantic aspects of words are 
often learned in a context (syntax) 
and have a transfer value in a context. 
Whereas synonyms and antonyms out 
of context, as in the present experi- 
ment, have some transfer value, it is 
far less than word associates and 
homonyms out of context. 


READING RATE AND IMMEDIATE VERSUS 
DELAYED RETENTION 


STANTON P. THALBERG 
University of Washington 


The purpose of this study was to determine whether the relationship 
between reading rate and retention is invariable over time. 176 college 
Ss were randomly assigned to 1 of 2 conditions of retention. Ss in 
the immediate treatment (IR) were tested immediately after reading 
a 1500-word passage. Ss in the delayed treatment (DR) read the same 
passage but were tested 24 hrs. later. Ss were subdivided within 
treatments into fast-, average-, and slow-rate groups. Results indicated 
that under IR conditions slow readers retained significantly more than 
both average and fast readers. Under conditions of DR, retention 
differences between rate-groups disappeared. Implications are that 
while more efficient readers remember fewer of the details in a mes- 
sage immediately following the reading than do their slower counter- 
parts, these details extinguish for both groups equally within 24 hrs. 


Studies concerned with the rela- 
tionship between reading rate and 
comprehension date back to Ro- 
manes’ (1884) research conducted 
over eight decades ago. Time has 
apparently neither detracted from the 
importance of considering this rela- 
tionship nor relegated it to an obscure 
position in educational research, de- 
spite the relative tapering off of 
research bulk during the 1960s. The 
continuance of investigation is due at 
least in part to the divergent results 
already obtained. The relationships 
between rate and comprehension, as 
reported in the professional litera- 
ture, range from moderately negative 
(r = -.47) to highly positive (r = 
.98) with the great majority of stud- 
ies yielding a negligible or slightly 
positive correlation coefficient. 
Numerous explanations for these 
discrepant results have been postu- 
lated and systematically included 
into the body of professional knowl- 
edge. These explanations have in- 
cluded the nature of the reading 
material (Thurstone, 1944), the diffi- 
culty of the reading material (Tinker, 
1939), the purpose and set of the 
reader (Carlson, 1949; Shores & 
Husbands, 1950), the intellectual 
level of the reader (Carlson, 1949), 
and the techniques employed in 
measuring both the reading rate and 
level of comprehension (Preston & 
Botel, 1951; Stroud, 1942, 1956; 
Thalberg, 1959). The impact of these 
studies has resulted in a tempering 
of judgment and qualifying of pro- 
nouncements regarding the predicta- 
bility of either rate or comprehension 
from mere knowledge of the other. 

Yet, despite the impressive accum- 
ulation of research, it is neither ex- 
haustive nor plethoric. For example, 
all preceding research has assessed 
comprehension and retention im- 
mediately following the reading, the 
apparent premise being that what is 
true of the relationship under condi- 
tions of immediate recall holds 
equally well if an extended period of 
time intervenes between reading and 
the measurement of retention. 

Early investigators of the relation- 
ship between rate of learning and re- 
tention tended toward a practice-equal 
learning-variable paradigm (Gates, 
1918; Gillette, 1936; Thorndike, 1910) 
with the understandable result that 
fast learners retained more than slow 
learners. 

More recent research, however, has 
tended toward a learning-constant/ 
practice-variable model in which 
learning to a given criterion is the 
independent variable and the num- 
ber of trials to achieve this level of 
performance is the dependent varia- 
ble (Stroud & Schoer, 1959; Under- 
wood, 1954). Under these conditions, 
the relationship between rate of 
learning and retention is no more 
than slight. 

Since the practice of encouraging 
students to increase their reading 
rate persists, it was appropriate to 
view the rapid-reading phenomenon 
from a slightly different vantage 
point. 

The current emphasis upon rate 
encourages students to read more 
rapidly and promises no loss in 
immediate retention; however, this in- 
crement in speed may facilitate, in- 
hibit, or bear no relationship to long- 
term retention. The sole provision of 
immediate rewards for correct re- 
sponses and the exclusion of delayed 
reinforcements may, in fact, be con- 
ditioning the students to the persua- 
sion that no additional benefits will 
accrue if the information is retained 
beyond the immediate test period. 

This study, then, seeks informa- 
tion relative to the impact of reading 
rate on immediate and delayed reten- 
tion. In brief, answers to two ques- 
tions were sought: Does reading rate 
differentially affect the magnitude of 
immediate and delayed retention? Is 
the relationship between rate and 
comprehension invariable and there- 
fore independent of the time inter- 
val between reading and the measure 
of retention? 


METHOD 


The experimental subjects (Ss) consisted 
of 176 college undergraduates enrolled in 13 
sections of freshman rhetoric (freshman 
English) at the University of Iowa. These 
participants were randomly assigned by in- 
tact class sections to one of two experimen- 
tal treatment groups. The Ss were further 
subdivided within treatments into three rate 
levels. Certain Ss (not included in the 176) 
were excluded from the study for the fol- 
lowing reasons: 19 for whom American Col- 
lege Test (ACT) scores were not available, 
11 who had incomplete response data, and 9 
who were eliminated randomly to effect 
proportionality between cells of the experi- 
mental design. 

The Traxler High School Reading Test, 
Form B, Part 1 (Traxler, 1939) was em- 
ployed as the measure of both reading rate 
and retention. Despite the fact that the test 
was not primarily intended for use at the 
college level, it was utilized in this investi- 
gation for the following reasons. First, it 
consists of a reading passage of approxi- 
mately 1500 words, a length sufficient for a 
relatively stable measure of reading rate. 
Second, although the test deals with a rela- 
tively sophisticated and technical topic (the 
historical development of modern-day cal- 
endars), its readability is at a standard 
level, that is, written between an eighth and 
ninth-grade level as determined by both the 
Dale-Chall (1948) readability formula and 
the Flesch Reading Ease Score (1948). 
Third, the test provides a means for meas- 
uring both rate and comprehension over the 
same material. Fourth, a larger number of 
questions (20) is provided than is true of 
other comparable tests, thereby providing 
a more stable and reliable measure of read- 
ing comprehension. Finally, the test is amen- 
able to an amount-constant assessment of 
rate rather than a time-constant measure. 

The directions for administering the test 
were adapted from the modified time-con- 
stant measure of rate recommended by the 
publishers to an amount-constant approach 
in order to assure a more reliable and rep- 
resentative estimate of rate and a more 
factual measure of comprehension. The fol- 
lowing specific directions were read to Ss: 

This is a test of your overall reading 
ability. The test contains a story for you 
to read and some exercises for you to 
work after you have read the story. Read 
the article through once for the primary 
purposes of learning and retaining as 
much as possible of its content. You will 
be asked questions over the article. Read 
rapidly but stress your understanding. 

Remember, you cannot work the exercises 
over the content unless you know and re- 
member what you have read. 

In determining their rate of reading, Ss 
were told that immediately upon finishing 
the reading they were to record the number 
of elapsed seconds—taken from the chalk 
board—on their answer sheets. Ss in the 
Immediate Recall (IR) treatment-group 
were instructed to answer the comprehen- 
sion questions immediately after recording 
their reading time. Ss in the Delayed Re- 
tention (DR) treatment-group were in- 
structed to close their test booklets, record 
their reading time, and await further instruc- 
tions. Within this latter group (after all Ss 
had completed the reading) the materials 
were collected and the participants were 
dismissed. The following day the materials 
were redistributed and each S was required 
to respond to the questions without further 
reference to the article. 

Within the two treatment groups Ss were 
subdivided into three levels based upon 
their performance on the rate aspect of the 
test. A high reading group was composed of 
Ss reading 300 or more words per minute, 
an average reading group contained Ss read- 
ing between 240 and 299 words per minute, 
and a low-rate group was composed of Ss 
reading 239 or fewer words per minute. The 
final constitution of these three groups by 
treatments is reported in Table 1. None of 
the differences between mean rates within 
levels was significant at the .05 level. 
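
The amount-constant rate measure and the three-level grouping described above can be sketched as follows. This is only an illustration: the passage length (`PASSAGE_WORDS`) is not reported in the text and is a hypothetical value, as are the function names; the cut points of 300 and 240 words per minute are the ones stated above.

```python
# Sketch of the amount-constant rate measure and the rate-level grouping.
# PASSAGE_WORDS is a hypothetical passage length; the source does not report it.
PASSAGE_WORDS = 1000


def words_per_minute(elapsed_seconds: float, passage_words: int = PASSAGE_WORDS) -> float:
    """Rate when the amount read is fixed and the time taken is recorded."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return passage_words * 60.0 / elapsed_seconds


def rate_level(wpm: float) -> str:
    """Assign a reader to a level using the cut points reported in the text."""
    if wpm >= 300:
        return "high"
    if wpm >= 240:
        return "average"
    return "low"


if __name__ == "__main__":
    # Three hypothetical reading times, in seconds, for the fixed passage.
    for seconds in (180, 240, 300):
        wpm = words_per_minute(seconds)
        print(seconds, round(wpm, 1), rate_level(wpm))
```

Under an amount-constant procedure every S reads the same passage, so the only recorded quantity is elapsed time; the rate follows directly from it.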

ACT scores were available on each of
the 176 Ss. The means range from 45.41 to
51.16, and none of the differences between
subgroups was statistically significant at the
.05 level. Consequently, any possible impact
of verbal ability upon the criterion measures
was argued to be either nonexistent or
minimal.


RESULTS AND DISCUSSION 


Mean scores and standard deviations
for each of the three experimental
subgroups on both measures
of retention are reported in Table 2.


TABLE 1
TABLE OF RATE MEANS BY LEVELS AND
RETENTION CONDITIONS

                              N      Mean        SD

Immediate retention (80)
    Fast rate                25   [illegible]   55.95
    Average rate             25     252.76       9.42
    Slow rate                30     207.90      20.62
Delayed retention (96)
    Fast rate                30     340.86      66.35
    Average rate             30     255.40      11.44
    Slow rate                36     215.44      17.85

TABLE 2
RETENTION MEANS AND STANDARD DEVIATIONS
BY RATE LEVELS AND RETENTION
CONDITIONS

                              N      Mean      SD

Immediate retention (80)
    Fast rate                25     11.28     2.85
    Average rate             25     12.56     2.04
    Slow rate                30     13.66     1.99
Delayed retention (96)
    Fast rate                30     10.40     2.47
    Average rate             30      9.83     1.98
    Slow rate                36     10.91     2.20


Retention scores for the comprehension
task were submitted to a
two-dimensional (Rate of Reading ×
Retention Condition) treatment-by-levels
analysis of variance (Lindquist, 1953).
The main effects of
retention and rate were both significant,
as was the interaction effect.
IR scores were significantly superior
to those of DR, the former condition
yielding a mean of 12.57 and the
latter a mean of 10.41 (F = 38.35,
p < .001). These findings are consistent
with the principles of retroactive
and proactive inhibition.

Of greater interest than the superiority
of the IR condition over DR
is that when collapsed across treatments
the three rate levels are distinguishable
by level of comprehension
(F = 6.08, p < .01). The mean
retention score of the high-rate group
was 10.80, the average-rate group
11.07, and the low-rate group 12.16.
The consistency of the inverse relationship
was somewhat surprising in
view of the voluminous studies which
suggest that the relationship between
rate and comprehension is virtually
nil. In this regard the low-rate group
was found to be significantly higher
than both the average and the high
groups (t = -2.59, df = 1/64; t =
-3.23, df = 1/53; respectively) at the
.01 level.

TABLE 3
CORRELATION MATRIX FOR ABILITY, RATE,
AND RETENTION

            ACT      Rate     Comp
ACT          --     -.084     .422*
Rate       -.042      --     -.385*
Comp        .288*   -.193*     --

Note. Correlations above diagonal are
immediate retention (N = 80); below are
delayed (N = 96).

* p < .05.


Because of the significant interaction
effect (F = 3.08, p < .05), a
subsequent analysis of treatments-within-rate-levels
employing a simple
randomized design was conducted
and indicated a significant treatment
effect (p < .01) within the average-
and low-rate groups. No statistically
significant differences were forthcoming
within the high group. In other
words, average and slow readers,
while ostensibly comprehending more
immediately after reading than the
fast readers, showed a relatively high
rate of deterioration of their comprehension
over time. On the other
hand, the rapid readers, although
observably lower than the other comparison
groups initially, found their initial
level of comprehension virtually
intact over time: They lost only .88
questions between immediate and delayed
trials, whereas the average
group lost 2.73 and the low group
2.75 questions.
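
The collapsed means and the retention losses quoted above follow arithmetically from the cell sizes and cell means of Table 2. The sketch below reproduces them as a consistency check; the function names are illustrative, and the delayed average-rate mean of 9.83 is an inference from the reported loss of 2.73 questions (12.56 - 2.73) rather than a directly legible table entry.

```python
# Cell sizes and mean retention scores taken from Table 2.
# Assumption: the delayed average-rate mean (9.83) is inferred from the
# reported loss of 2.73 questions, not read directly from the table.
CELLS = {
    ("immediate", "fast"): (25, 11.28),
    ("immediate", "average"): (25, 12.56),
    ("immediate", "slow"): (30, 13.66),
    ("delayed", "fast"): (30, 10.40),
    ("delayed", "average"): (30, 9.83),
    ("delayed", "slow"): (36, 10.91),
}


def marginal_mean(condition=None, level=None):
    """N-weighted mean over all cells matching the given condition and/or level."""
    picked = [
        (n, mean)
        for (cond, lev), (n, mean) in CELLS.items()
        if (condition is None or cond == condition) and (level is None or lev == level)
    ]
    total_n = sum(n for n, _ in picked)
    return sum(n * mean for n, mean in picked) / total_n


def retention_loss(level):
    """Drop in mean score from the immediate to the delayed condition."""
    return CELLS[("immediate", level)][1] - CELLS[("delayed", level)][1]


if __name__ == "__main__":
    for cond in ("immediate", "delayed"):
        print(cond, round(marginal_mean(condition=cond), 2))
    for lev in ("fast", "average", "slow"):
        print(lev, round(marginal_mean(level=lev), 2), round(retention_loss(lev), 2))
```

The weighted condition means agree with the 12.57 and 10.41 reported for IR and DR, and the weighted rate-level means agree with the 10.80, 11.07, and 12.16 reported for the high-, average-, and low-rate groups.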

The experimental design was
broken down further to allow for an
analysis of differences between rate
groups within retention conditions.
Within the IR treatment the low-rate
group was significantly higher in understanding
than both the high- (F =
13.30, p < .01) and the average-rate
groups (F = 4.12, p < .05). Under
the DR treatment, however, the
low-rate group maintained its superiority
over only the average group
(F = 4.27, p < .05), the difference
between the low and the high groups
apparently dissipating with time.

Two alternatives present them- 
selves as possible explanations of the 
finding that retention differences be- 
tween readers of varying rates, 
though clearly favoring the slow 
readers immediately following the 
period of acquisition, wash out with 
time. 

On the one hand, it can be postu- 
lated that the slow reader receives 
more vivid sensory impressions of the 
material because of the doting nature 
of his reading, which, in turn, allows
him to accurately feed back the material
immediately. However, the advantage
gained initially through this
increased clarity of the sensory im- 
pressions rapidly diminishes—proba- 
bly as a result of retroactive inhibi- 
tion or disuse—so that the temporal 
enhancement almost totally disap- 
pears after twenty-four hours or less. 
No theory has yet been advanced 
which postulates that vivid sensory 
impressions simultaneously lead an 
individual toward more concrete and 
functional initial understandings and 
to fewer, less effective understandings 
with the passage of time. Indeed, 
clinical experiences suggest the op- 
posite conclusion: The more vivid a 
sensory experience, the longer it is 
retained. 

On the other hand, it may be postu- 
lated that the slow reader—by virtue 
of more deliberate attempts to under- 
stand—is more attentive to minor 
points and details within the message, 
while the faster reader is more concerned
with the organization of these
details into meaningful generalizations.
After a 24-hour intervening
period, however, these details deteriorate
for all readers (regardless of
rate) and only the more generalized
concepts remain. These generalizations
apparently are well-retained by
readers of all rates.

Although no analysis was made of 
the responses of Ss of various read- 
ing rates to different types of com- 
prehension questions (detailed or 
generalized), the investigator did in- 
formally examine Ss’ responses with 
this in mind. The Ss with a slow or 
average rate tended to perform better 
than fast readers on questions in- 
volving details during IR; yet during 
the DR condition all three classes of 
Ss tended more toward correct re- 
sponses on the general-type ques- 
tions and toward more incorrect 
responses on questions dealing with 
details. 

Table 3 presents the correlation
matrix pertinent to the constancy
over time of the relationship between
rate and comprehension. Although
the coefficients of -.385 under
the immediate condition and
-.193 under the delayed appeared to
differ sufficiently for statistical significance,
analysis by the z' transformation
(Edwards, 1954) indicated
that the discrepancy was not large
enough to reach significance (p >
.15). Consequently, one is led to the
conclusion that the relationship between
rate and comprehension is relatively
invariant.
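
The z' comparison of the two rate-comprehension correlations can be sketched as below. This is a standard Fisher transformation test for two independent correlations, not necessarily the exact computation the author carried out; the function names are illustrative, and the use of a two-tailed normal probability is an assumption, since the text does not say which tail convention was applied.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's z' transformation of a correlation coefficient."""
    return math.atanh(r)


def compare_independent_correlations(r1: float, n1: int, r2: float, n2: int):
    """Return (z, two-tailed p) for the difference between two independent r's."""
    # Standard error of the difference between two z' values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / se
    # Two-tailed probability under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Rate-comprehension correlations: immediate (N = 80) and delayed (N = 96).
    z, p = compare_independent_correlations(-0.385, 80, -0.193, 96)
    print(round(z, 2), round(p, 3))
```

With the reported coefficients and group sizes the difference falls well short of conventional significance, in line with the p > .15 stated above.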

The relationship between ability
(as measured by the ACT) and rate
and the relationship between ability
and comprehension under both retention
conditions were also investigated.
As can be seen in Table 3, no
apparent relationship between ACT
and rate under either condition was
noted, the correlation coefficients being
-.034 and -.042 under immediate
and delayed trials, respectively.
The relationship between
ACT and comprehension was statistically
significant (p < .05), the
coefficients being .422 for immediate
and .388 for delayed retention. In
other words, these relationships suggest
that reading rate is independent
of ability and is apparently more of
a mechanical or perceptual nature
than of an intellectual one. Conversely,
however, ability to comprehend
and retain verbal material is, at
least in part, a function of intellectual
ability.

If one can generalize from the re- 
sults of this investigation to their ed- 
ucational implications, the relative 
efficiency of the faster reader over 
that of the slow or average reader 
maintains itself over time. That is to 
say, although the more efficient
reader may not comprehend as many
minutiae immediately following his
reading as either the average or the
slower readers, he will in the long run
have as much functional information 
available to him from his reading as 
either of the other levels. Further, the 
less efficient reader appears to forget 
more of what he reads over time than 
the more efficient reader. This seems 
to imply that reading improvement 
courses which stress rate and compre- 
hension equally will not detrimen- 
tally influence the retention of their 
clientele.


Indiana University


2 investigations of "cheating" were conducted: one at a small liberal
arts college (N = 165), and the other at a large metropolitan university
(N = 533). Anonymous questionnaires were administered to representative
samples of the 2 student bodies, and relationships between
the extent of admitted cheating behavior, estimates of the
amount of cheating within the college or university, and attitudes
toward cheating were determined. The group of individuals who
classified themselves as cheaters tended to give higher estimates of
the extent of cheating by others than did noncheaters. These same
individuals tended to be less condemning of cheating, and to explain
cheating as being due to environmental pressures. The noncheating
group described the cheater as an individual having a basic personality
defect. (Reservations concerning this conclusion were made.) Of 3
classes of situations thought conducive to cheating (adverse physical
conditions in the classroom, inadequate tests and testing procedures,
and instructor failings), instructor failings were considered most
responsible for cheating by both the cheater and noncheater groups.
(Tested only on School A students.) Differences between cheaters and
noncheaters were observed on both personal and demographic levels.
Alternative explanations of the findings were considered.


Why do students cheat on exami- 
nations? To gain a better grade? 
Perhaps. But not many students put 
it quite this way. Cheating is thought 
by a sizeable number of students 
(12% of one student body), to be 
just “playing the game with the 
professor.” However, the authors 
have found student responses that 
ranged all the way from extreme 
condemnation of the practice 
(“cheaters should be put in jail as 
common thiefs”) to views expressing 
the belief that cheating is praise- 
worthy (“successful cheating gives 
one a pride of accomplishment”). 

This report describes two studies¹
of cheating. Both studies were instigated
by administration, faculty,
and student groups, suddenly
alarmed at the "cheating situation"
on their campuses. The student-faculty-administration
promoters of
these studies had no theoretical interests,
but the present writers tried
to reap a theoretical by-product as a
dividend for the work involved.


¹ Many persons gave unstintingly of their
time in completing various phases of the two
studies reported here. The late Franklin
Fearing served as a constant and thorough
critic. Stanley Kegel helped in the collection
and coding of data, and Bernice Eisman in
the analysis. At both schools, service organizations
truly proved to be of service.
The administrations of both schools gave
encouragement, while following a strictly
“hands-off” policy.


The hypotheses for both studies
were based on what has been called
“social-perceptual theory.” Else
Frenkel-Brunswik (1951), for example,
in a discussion of the kinds of
statements people make in reference
to their own shortcomings, points out
that to project one’s shortcomings
onto the environment is a comfortable
way to deal with what would
otherwise be unflattering self-thoughts,
for one thereby remains unperceptive
of one’s failings. As Frenkel-Brunswik
describes it, “we do not
always see ourselves as we are, but
instead perceive the environment in
terms of our own needs. Self-perception
and perception of the environment
actually merge in the service of
these needs [1951, p. 379].”

If to label oneself a “cheater” is an 
unflattering self-thought, the kinds of 
dynamics Frenkel-Brunswik speaks 
of would be predicted to come into 
play. This was the orientation that 
guided the derivation of the hypothe- 
ses for test. However, the findings to 
be reported can as well be explained 
within the bounds of another theo- 
retical framework; and since this un- 
anticipated mode of explanation is 
postdictive, discussion of it is saved 
for the last part of the paper. 

The two studies here described are
essentially replications of one another.
Replication was thought to be
of more than usual value because of 
wide differences between the two 
schools studied. School A is a small, 
4-year, liberal arts college in a small 
town in a sparsely populated area. 
School B is a large metropolitan uni- 
versity located in a densely popu- 
lated area and has a heavy graduate 
program. Because of these differences 
between the schools, the generality of 
any results supported by both stud- 
ies should be broad. 

Throughout, it has been assumed 
that honesty is a continuously dis- 
tributed variable that is defined in 
terms of the situation as much as by 
the specific act. But in order to 
avoid an involved style, the authors 
will use the terms “cheater” and 
“noncheater,” even though to do so 
implies both a nonexistent dichotomy 
and a degree of absoluteness that is
unwarranted. 


METHOD 
The School A Study 


A questionnaire (see ADI document A)²
was developed to elicit student judgments
about the cheating behavior of themselves
and others, their attitudes toward cheating,
their opinions about the "cheating situation"
at School A, and personal data.

² Material supplementary to this article
has been deposited with the American Documentation
Institute. Order Document No.
9632 from ADI Auxiliary Publications
Project, Photoduplication Service, Library
of Congress, Washington, D. C. 20540. Remit
in advance $3.75 for microfilm or $2.50
for photocopies and make checks payable to:
Chief, Photoduplication Service, Library of
Congress.

The sample was 10% of the regular undergraduate,
full-time student population. The
seventh (chance determined), seventeenth,
etc. students in the registrar's alphabetical
listing were chosen, and the necessary locating
information recorded. These students
received a letter requesting that they present
themselves at a central polling station
to give their opinions on the "cheating situation"
at School A. The student was promised
complete anonymity by direct statements
in the letter, in the student newspaper,
in the questionnaire, and by suggestion. For
example, his letter was signed by the student
body president, indicating student
backing; the polls were manned by members
of a school service organization, and no
faculty or administrative person had contact
with the respondents. At the central polling
station, the student was told that when
finished he was to fold his questionnaire
and drop it in the locked ballot box provided.
He completed the questionnaire at a
screened table in comfort and privacy.

During the first week 40% of the sample
responded. Nonrespondents were then telephoned
or sent a post card, and another
30% responded. Finally, members of the
service organization were given the names
of the remaining nonrespondents along with
their class schedules. These persons were
personally contacted and were requested to
fill out the “cheating questionnaire.” With
this the final N became 165, or 89% of the
sample of 186.

Checks on sample representativeness (in
terms of the sample proportion of freshmen,
sophomores, etc., compared with these
proportions as recorded by the registrar)
showed all differences between sample and
universe to be insignificant (all p values
were in the .90’s).


The School B Study


This questionnaire is available as ADI
document B. Five percent of the full-time
undergraduate and graduate student body
made up the sample. After three contacts,
when necessary, and including a personal
contact the third time, a sample of 533 (a
90% return) was achieved. Representativeness
was checked as in the first study, and
very close universe/sample proportions were
noted. Time of reporting (that is, after one,
two, or three requests) was not significantly
related to whether or not the student admitted
cheating, or expressed condemning
or condoning attitudes toward cheating
(p > .20 and p > .30, respectively).
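
The every-tenth-name selection described for School A, and the return rate it produced, can be sketched as follows. This is an illustration only: the roster of 1,860 entries is hypothetical (the text reports a 10% sample of 186 but not the enrollment), and the function names are not from the source.

```python
import random


def systematic_sample(roster, step=10, start=None, rng=None):
    """Take every `step`-th entry, beginning at a chance-determined 1-based position."""
    rng = rng or random.Random()
    if start is None:
        start = rng.randint(1, step)
    return roster[start - 1::step]


def return_rate(respondents: int, sample_size: int) -> float:
    """Percentage of the drawn sample that completed the questionnaire."""
    return 100.0 * respondents / sample_size


if __name__ == "__main__":
    # Hypothetical alphabetical listing; positions stand in for student names.
    roster = list(range(1, 1861))
    sample = systematic_sample(roster, step=10, start=7)
    print(len(sample), sample[:3])
    print(round(return_rate(165, 186)))
```

Starting at the seventh position and stepping by ten picks the seventh, seventeenth, twenty-seventh, and later entries, which is the pattern the School A procedure describes.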


RESULTS 


Hypothesis 1 


Individuals who deviate from 
group prescribed norms will differ 
significantly from those who do not 
deviate in giving higher estimates of 
the number of individuals who deviate 
as they do. In the present instance, 
the cheater will perceive more cheating 
going on about him than will the non- 
cheater. 

School A study. One variable was 
the subject's (S's) reply to a ques- 
tion asking: “In your opinion, about 
what percent of [School A] students 
cheat fairly regularly?” Since S 
could have no objective basis for his 
judgment, his response was regarded 
as reflective of his personal outlook. 
As expected of a projective item, this 
one yielded an extremely wide range 
of responses (from less than 10% to 
over 90%; median = 23%).

The second variable was S’s state- 
ment as to the number of situations 
in which he had cheated. These situ- 
ations described “rationalizations” 
for cheating. An example is: “I have
cheated when so many others were
doing it that one had to do so in
order to have a fair chance in competing
for grades.” The S was instructed
to indicate those statements
that applied to himself. There were
eight rationalizations, and “cheating
scores” could therefore range between
zero (cheating denied in all
situations) and eight (the respondent
admitted cheating in every situation
described).

The School A data tend to confirm
Hypothesis 1: contingency coefficient
(C) = .36, N = 161, p < .02,
4 × 5 table. (See ADI Table A.)³

School B study. The projection 
variable in the second study was 
measured in terms of responses to 
this question: 

In your opinion about what proportion of
students at [School B] frequently cheat?
(Check one).

(a) practically all of them
(b) [illegible] of them (i.e., about 75% of them)
(c) about half of them
(d) [illegible] of them (i.e., about 25% of them)
(e) practically none of them


A “high-perceiver” group, a “low- 
perceiver” group, and a “medium- 
perceiver” group were defined as 
those who checked (a) or (b), (d) 
or (e), or (c), respectively. 

The Ss were also asked to indicate
their cheating behavior in terms of a
three-fold breakdown: “active cheating,”
“passive cheating,” and “noncheating.”
The questionnaire defined
active cheating as “cheating in which
the person benefits by such an act”
(e.g., receiving unauthorized information).
Passive cheating was defined
as “cheating in which another
individual benefits by the act” (e.g.,
giving unauthorized information to
another). Those indicating they had
cheated only actively, or both actively
and passively, were designated
“active cheaters” (N = 252). Those
who stated that they had not cheated
at School B were designated “noncheaters”
(N = 185). Except for nine
nonrespondents to the cheating question,
the remainder (N = 87)
claimed to have cheated only passively.


³ In lieu of tabularly presented data, the
authors report the contingency coefficient
(C), and also N, p, and the number of categories
(Row × Column) in the table under
consideration. This last, of course, determines
the theoretical upper limit of the
contingency coefficient: a value ranging from
the mid .70s through the .80s for tables of
the size employed in this study.

The median value of a sample of 112
correlation ratios (eta) derived from studies
reported in recent APA journals was .42,
range .05-.92 (Dunnette, 1966). These correlation
ratios (derived from the t and F
values reported by the authors involved)
have a theoretical upper limit of 1.0.


The hypothesis that active, pas- 
sive, and noncheaters perceive signifi- 
cantly different amounts of cheating 
in the university was tested by means 
of chi square. The hypothesis tended 
to be confirmed with C = .29, N = 
524, p < .001, 3 x 3 table. (See ADI 
Table B.) 
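
The contingency coefficient reported with each chi-square test, and its ceiling for a given table size, can be sketched as below. The formulas are the usual Pearson definitions (C = sqrt(chi-square / (chi-square + N)), with a maximum of sqrt((k - 1) / k) where k is the smaller of the row and column counts); the function names are illustrative, and the chi-square values themselves are not given in the text, so the last function only backs out the value implied by a reported C and N.

```python
import math


def contingency_coefficient(chi_square: float, n: int) -> float:
    """Pearson's contingency coefficient C from a chi-square value and sample size."""
    return math.sqrt(chi_square / (chi_square + n))


def max_contingency_coefficient(rows: int, cols: int) -> float:
    """Upper limit of C for a rows x cols table."""
    k = min(rows, cols)
    return math.sqrt((k - 1) / k)


def implied_chi_square(c: float, n: int) -> float:
    """Chi-square value implied by a reported C and N (inverse of the formula above)."""
    return n * c ** 2 / (1.0 - c ** 2)


if __name__ == "__main__":
    for rows, cols in ((2, 3), (3, 3), (4, 5)):
        print(rows, cols, round(max_contingency_coefficient(rows, cols), 3))
    # Hypothesis 1, School B: C = .29, N = 524, 3 x 3 table.
    print(round(implied_chi_square(0.29, 524), 2))
```

Because the ceiling depends on table size, a C of .29 in a 3 × 3 table and a C of .36 in a 4 × 5 table are not on the same scale; dividing each by its upper limit gives a rough basis for comparison.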


Hypothesis 2 


Individuals who deviate from 
group norms will be less condemna- 
tory of this behavior than those who 
do not deviate. The cheater, in con- 
demning cheating less strongly than 
the noncheater, thereby perceives 
others to be less condemning of him 
and he feels less subject to con- 
formity-producing pressures. 

School A study. The S's cheating
score was compared with his attitude
score, which was derived in the
usual manner for Likert scales, and
so that those who condoned cheating
most earned the highest scores.
The School A data tend to confirm
Hypothesis 2 with C = [illegible], N =
165, p < .001, 3 × 5 table (ADI
Table C).

JAMES Q. KNOWLTON AND LEO A. HAMERLYNCK


School B study. The 10 attitude
statements used in the School B study
were not scaled, and composite attitude
scores were therefore unavailable.
Consequently, Hypothesis 2 was
tested by separately comparing the
different cheating group responses to
the following four items, one at a
time.

1. “Cheating is basically as immoral as
[illegible].”

2. “As an ideal, honesty in examinations
deserves my support; but students are human
and cannot be expected at present to
attain this ideal.”

3. “Cheating reveals a basic defect in the
character of the student.”

4. “Cheating is basically wrong but is
sometimes necessary when too much emphasis
is placed upon grades.”


These particular items were se- 
lected as potentially good discrimi- 
nators because the percentage of all 
respondents in agreement with them 
ranged closely about the 50% mark 
(46%-56%). The other six state- 
ments were agreed to by approxi- 
mately two-thirds or over, or by one- 
third or less of the respondents. 
Chi-square tests of the School B data 
are reported in ADI Tables D, E, F, 
G. The associated C values ranged 
from .18 to .28, all Ns = 524, all ps < 
.001, tables were 2 x 3. 


Hypothesis 3 


Individuals who deviate from so- 
cially prescribed norms will tend to 
attribute their own deviation to ex- 
ternal forces, while individuals who 
do not deviate will tend to believe 
that those who do, do so because of 
internal forces (e.g., a personality
with character defects). For example, 
if I cheat, I would claim it is because 
of unfair grading methods, because 
others also cheat, etc. (external fac- 
tors); while if you cheat, and I 
don’t, it is because you are “lazy,” 
have a “weak character,” etc. (in- 
ternal factors). 

Attitude items blaming cheating 
on external circumstances tended to
be judged less condemning of cheating
than did those items that attributed
cheating to internal, or character
defects. Therefore, if cheaters
agree with items explaining cheating
to be a result of internal or per- 
sonality factors (while noncheaters 
respond oppositely), this does not 
unequivocally confirm Hypothesis 3. 
Cheaters would be expected to pre- 
fer moderately condemning state- 
ments to more severely condemning 
statements whether or not the moder- 
ate statement blames cheating on the 
environment. 

School A study. The School A test 
of Hypothesis 3 was open to this 
criticism. However, the results sug- 
gest support (though they do not 
provide critical evidence) for the hy- 
pothesis. The Ss were first divided 
into two groups on the basis of their 
tendency to endorse attitude items 
that blamed cheating on external 
forces, or to endorse attitude items 
that placed the blame on the in- 
dividual (internal forces). Those who 
scored higher on external than in- 
ternal items were placed in one 
group, and those who scored op- 
positely in a second group. This vari- 
able, as contingent upon the cheating 
score variable, was tested by means 
of chi square, yielding qualified con- 
firmation: C = .33, N = 165, p < 
.001, 2 x 5 table. (See ADI Table H). 

School B study. Four categories 
were distinguished in terms of the 
Ss’ replies to the following two atti- 
tude items. 


1. “Cheating reveals a basic defect in the 
character of the student.” 

2. “Cheating is basically wrong, but is 
sometimes necessary when too much em- 
phasis is placed upon grades.” 

Individuals who agreed with the
first item and disagreed with the second
were designated the “character-defect”
group (N = 169). Those who
responded just oppositely were designated
the “social pressures” group
(N also equaled 169). Some individuals
agreed with both statements (N
= 82) and others disagreed with both
(N = 103).

These last two groups did not
enter into the theoretical scheme of
the investigation. The only prediction
was that the “basic defect”
group would have a larger than
chance proportion of noncheaters, and
that the “social pressures” group
would have a larger than chance
proportion of active cheaters. Hypothesis
3 was tentatively confirmed:
C = .36, N = 338, p < .001, 2 × 3
table (ADI Table I).

Somewhat greater confidence may 
be placed in the School B than in the 
School A results although, again, 
confirmation is not unequivocal. The 
two attitude items differ somewhat in 
the strength of their condemnation of 
cheating, but both indicate that 
cheating is “basically wrong” and 
both items were agreed to by ap- 
proximately the same percentage of 
all respondents (53% and 47%, re- 
spectively). 


Cheating and Sundry Variables 


In both schools, the cheater could 
be typified as being younger than the 
noncheater, a fraternity member, de- 
pendent upon others for financial 
support, single, a freshman or sopho- 
more, and possessor of low grades. 
(Associated chi square-based p values 
were all less than .001.) Whether 
or not a student claimed church 
membership (p > .70), or was male 
or female (p > .20), was unrelated 
to cheating behavior. 

An honor system for controlling
cheating was not popular. This system
was preferred by but 28% of the
combined samples of students. Most
popular with both student bodies was
a system in which the faculty do the
police work, while the students serve
as trial judges (50% of combined
samples). Both cheaters and noncheaters
at School A (the question
was not asked at School B) felt that
instructor shortcomings (42%) were
more responsible for frequent cheating
than were factors relating to the
nature of the test (23%) or the
physical testing environment (35%):
chi-square-based p < .02.


Discussion 


At first blush, the clinical concept 
of projection would seem sufficient to 
explain the above-reported findings. 
Cheaters perceive both themselves 
and their environment in such a way 
as to be able to maintain flattering, 
or at least acceptable ways of re- 
garding themselves. 

But individuals also act upon their 
environment and thereby change what 
is objectively at hand to be per- 
ceived. One may thus speak of selec- 
tive perception within an environment 
and, as well, of a functionally selective 
choice of environment. And of course 
a most important way in which en- 
vironments are selectively chosen is 
through the friends one chooses and 
the groups in which one seeks mem- 
bership. 

If a person behaves in a moder- 
ately unacceptable way (as, for ex- 
ample, by cheating), either the mech- 
anism of selective perception, or the 
functionally selective choice of an 
environment (of friends), may be 
sufficient to provide that person with
adequate protection against unflat- 
tering self-thoughts. If a person be- 
haves in a more seriously unac- 
ceptable way (as, for example, by 
indulging in homosexual behavior), it 
may become necessary for him to 
“bolster” selective perception (in the 
sense of projection), with a selective 
choice of friends who are similarly 
deviant. 

If the selective choice of friends is
adequate to the need, one may be able
to "abandon" selective perception of
the environment, as it were, and re- 
port upon his social world pretty 
much as it objectively exists—that is, 
as it would be perceived by an ob- 
server who was not ego-involved, 
and who was in an advantageous 
position for observation. If this con- 
dition existed, a socially deviant per- 
son might be able to estimate the 
extent of deviant behavior in the com- 
munity at large more accurately than 
would the nondeviant person. 

In reference to the present study, 
and according to prediction, cheaters 
more than noncheaters perceived 
others to frequently cheat. But the 
tendency for deviants to overesti- 
mate the true frequency of deviant 
behavior could not be tested. The 
data suggest that the deviant in- 
dividual’s estimate may have been 
more accurate than that of the non- 
deviant. For instance, students at 
School A who have rarely or never 
cheated (cheating scores of zero or 
one), estimated the proportion of 
“fairly regular” cheaters at their 
school to be only about 10%; while 
“frequent cheaters” (those with 
cheating scores of three or more) 
estimated the number of their fellow
students who cheat “fairly regularly”
at around 40%. And yet 81%
of the total School A sample admitted 
having cheated while in college and 
46% admitted having cheated during 
the semester just completed. Al- 
though the ambiguity of the phrase 
“fairly regularly” makes interpreta- 
tion uncertain, it appears possible 
that the 40% estimate made by the 
cheaters may have been more ac- 
curate than the 10% estimate of the 
noncheaters. 

Two studies of sexual deviancy— 
Finger (1947), and Melikian and 
Prothro (1954)—further illuminate 
this interpretational difficulty. These
investigators used American and
Arab male college students as Ss. In 
both studies, the Ss were asked about 
their own sex practices, and they 
were also asked to estimate the fre- 
quency of these practices among 
friends, and among adult male mem- 
bers of their community. Those Ss 
who admitted deviant sex practices 
tended to believe that others also 
engaged in these same practices, and 
their estimates were higher than those 
given by Ss who denied indulging in 
deviant sex practices. 

The results of Finger, and of 
Melikian and Prothro, appear to pro- 
vide additional clear cases of projec- 
tion in the clinical sense of that 
term. And yet deviant Ss in both 
studies made more accurate esti- 
mates of the true frequency of de- 
viant sexual behavior than did non- 
deviants, as this was gauged against 
the frequency with which these same 
Ss admitted having themselves indulged
in deviant sex practices. For
example, pairs of estimates of fre- 
quency of deviant sexual activity 
among adult males in the community 
were available from these authors' 
data. There were two cultures (Amer- 
ican and Arab), and three "deviant" 
acts (masturbation, homosexuality, 
and heterosexuality outside of mar- 
riage), thus providing six bases for 
comparing the estimates of deviancy 
made by deviants and by nonde- 
viants. In five of those six cases, 
deviants gave more accurate esti- 
mates than did nondeviants. The 
exception concerned extramarital het- 
erosexuality in the case of American 
Ss. Deviants and nondeviants both 
overestimated, but deviants did so
by a factor of about 1.6 and non- 
deviants by a factor of about 1.4. 
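
The comparison in this exceptional case can be made explicit with a little arithmetic. The sketch below assumes that "a factor of about 1.6" means the ratio of the estimated frequency to the admitted frequency; the function names are illustrative and do not come from the original studies.

```python
def overestimation_factor(estimated: float, observed: float) -> float:
    """Ratio of estimated to observed frequency (assumed definition)."""
    if observed <= 0:
        raise ValueError("observed frequency must be positive")
    return estimated / observed


def distance_from_accuracy(factor: float) -> float:
    """How far a factor lies from 1.0, the value for a perfectly accurate estimate."""
    return abs(factor - 1.0)


# Factors reported in the text for extramarital heterosexuality, American Ss.
DEVIANT_FACTOR = 1.6
NONDEVIANT_FACTOR = 1.4

# True when the nondeviants' estimate lies closer to 1.0, as in this exception.
nondeviants_closer = (
    distance_from_accuracy(NONDEVIANT_FACTOR) < distance_from_accuracy(DEVIANT_FACTOR)
)
```

Under this reading, both groups overestimated, and the nondeviants' factor was the one nearer to 1.0, which is why this case counts against the general pattern.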

In the present study, both those 
labeled cheaters and those labeled 
noncheaters estimated less cheating 
among friends than among the college 
population in general (School A 
study). The median estimates were 
23% and 10% respectively. However, 
the cheater group lowered their esti- 
mates less (going from college popu- 
lation to friends), than did the 
noncheater group, though not signifi- 
cantly. This also would be predicted 
from “small group theory” more read- 
ily than from “perceptual defense 
theory.” 

It appears that the results of this 
study and the studies of Finger and 
of Melikian and Prothro, can be as 
well explained by the forces de- 
termining group membership, and 
group attitude and belief systems, as 
they can by a perceptual defense 
type of hypothesis. 


REFERENCES 


Dunnette, M. D. Fads, fashions, and folderol
in psychology. American Psychologist,
1966, 21, 343-352.

Finger, F. W. Sex beliefs and practices
among male college students. Journal of
Abnormal and Social Psychology, 1947,
42, 57-67.

Frenkel-Brunswik, E. Personality theory
and perception. In R. R. Blake & G. V.
Ramsey (Eds.), Perception: An approach
to personality. New York: Ronald, 1951.
Pp. 356-419.

Melikian, L., & Prothro, E. T. Sexual behavior
of university students in the Arab
Near East. Journal of Abnormal and Social
Psychology, 1954, 49, 59-64.


(Received December 9, 1966) 


Journal of Educational Psychology 
1967, Vol. 58, No. 6, 386-390 


COOPERATIVE VERSUS COMPETITIVE DISCUSSION 
METHODS IN TEACHING INTRODUCTORY 
PSYCHOLOGY 


DONALD BRUCE HAINES AND W. J. MCKEACHIE*
University of Michigan 


Cooperative and competitive techniques of teaching discussion sec- 
tions of general psychology were compared with respect to their effects 
on student anxiety, student achievement, and student satisfaction. 
The experiment involved 4 sections of introductory psychology. Stu- 
dents in these sections participated in class discussions conducted in 
a competitive manner for 2 weeks and with a cooperative method 
for 2 weeks. The competitive condition resulted in higher tension, 
poorer achievement in recitation, and less satisfaction than the
cooperative condition.


How should a college discussion 
section be conducted? The present 
research was a direct attempt to 
compare competitively oriented and 
cooperatively oriented techniques of 
discussion. The comparison was made 
in terms of the relative amounts of 
tension produced by each technique, 
the effect of each technique on stu- 
dent performance, and the effects of 
the techniques on student satisfaction
and recall.

A definitional scheme suggested by 
Deutsch (1949) assumes that in a 
group each member has certain ends 
he wishes to attain; that is, goals 
which have high attraction (or va- 
lence). To reach these goals, the in- 
dividual may or may not have to 
depend on the behavior of other 
group members. In cooperation, what
an individual does to help himself
helps others; in competition, anything
one does to help himself prevents
others from moving toward
their goals. In Deutsch’s experiment,
there was no clear difference between
competitive and cooperative
groups in terms of the amount of
individual learning.


*Donald Bruce Haines was killed in a
plane crash in Ethiopia in August 1965 while
doing research for the Aerospace Medical
Laboratories of Wright-Patterson Air Force
Base. This abridgement of his doctoral dissertation
was prepared by W. J. McKeachie,
his doctoral thesis chairman at the University
of Michigan, who benefited from suggestions
by Theodore Newcomb and members
of the staff of the project of the United
States Office of Education Research Contract
OE. No. SAE-8451 to W. J. McKeachie,
J. E. Milholland, and R. L. Isaacson.


Hypotheses 


1. There will be a higher level of 
group tension in competitive condi- 
tions than in cooperative conditions. 

Both cooperative and competitive 
groups have tensions increased by the 
nature of the task. However, there is 
a difference in the instrumental be- 
haviors leading to tension perceived 
by the individuals in the competitive 
group relative to the cooperative 
groups. Individuals in cooperative 
groups have many different paths to 
achieving their goals. This is not 
true for those in competitive groups. 
Consequently there is much greater 
probability that cooperative group 
members will have tension associated 
with the task reduced by the action 
of other members. The reverse is true 
for competitive groups. Here the 
probability is high that tension associated
with task will be increased
by the activity of other members. 

2. The higher tension levels in 
competitive conditions will result in 
disruption of performance in those 
conditions as compared with per- 
formance in a cooperative condition. 

3. Unreduced tensions associated 
with tasks begun by one individual 
and completed by another in the 
group will be greater for competitive 
conditions than for cooperative conditions.

Under cooperation, the completion 
of a task moves everyone toward the 
goal, regardless of who began the 
task, but this is not so under competi- 
tion. There only the person complet- 
ing the task moves toward his goal. 

4. The unpleasantness of high ten- 
sion levels in competitive conditions 
relative to tension levels in cooperative 
conditions will result in member pref- 
erence for and satisfaction with co- 
operative over competitive conditions 
in the classroom. 


METHOD


Subjects 


Eighty-two undergraduates enrolled in 
four sections of the introductory psychology 
course at the University of Michigan were 
used as subjects. The subjects were told 
that 4 weeks of the semester would be 
treated as experimental sessions in the sense 
that particular grading policies would be 
followed and that they would be observed 
during this period by two graduate students 
sitting in the class. 

To check comparability of groups, a num- 
ber of measures were gathered on the second 
day of class. Intellectual capacity was as- 
sessed by obtaining individual student 
American Council on Education (ACE)
scores from the Bureau of Psychological 
Services. Class level, area of concentration, 
age, and sex were taken from a personal 
data sheet circulated to all students in the 
experimental groups. Median breaks were 
made for each of these variables and a chi- 
square test for differences between sections 
was conducted. For none of the variables 
did differences approach significance. 


Enrollment figures were as follows: two 
sections of 20 students each, and two sec- 
tions of 21 each. 


Variables 


The independent variable for this re- 
search was teaching technique, which was 
varied so as to result in a competitive atmos- 
phere and a cooperative atmosphere. In both 
conditions, students were instructed that 
part of their final grade was dependent upon 
their recitation performance, but in the co- 
operative sessions they were also told that 
anything one individual did to help himself 
reach the goal (viz., answering recitation
questions correctly, thus getting a higher 
final grade) automatically moved everyone 
closer to each of their goals (i.e., when one 
student answered a question correctly, every- 
one got credit toward his individual final 
grade). In the competitive situation, how- 
ever, each student's grade depended on how 
well he answered questions as compared with 
the other students. 

Control for content and sequence was 
achieved by adopting the following balanced 
design: 

Classes 1 and 3: 2 weeks of cooperative 
technique + 2 weeks of competitive 
technique. 

Classes 2 and 4: 2 weeks of competitive 
technique + 2 weeks of cooperative 
technique. 

Two instructors each taught two classes. 
Each class was its own control. 
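
The balanced design above can be written down as a small lookup table and checked mechanically. This is only an illustrative sketch: the names `SCHEDULE` and `is_counterbalanced` are invented here, and the assignment of instructors to classes, which the text does not specify, is not modeled.

```python
from collections import Counter

# Order of techniques for each class: (first 2 weeks, last 2 weeks).
SCHEDULE = {
    1: ("cooperative", "competitive"),
    2: ("competitive", "cooperative"),
    3: ("cooperative", "competitive"),
    4: ("competitive", "cooperative"),
}


def is_counterbalanced(schedule: dict) -> bool:
    """Check that every class receives both techniques and that each
    technique comes first for the same number of classes."""
    techniques = {"cooperative", "competitive"}
    each_class_sees_both = all(set(order) == techniques for order in schedule.values())
    first_counts = Counter(order[0] for order in schedule.values())
    return each_class_sees_both and (
        first_counts["cooperative"] == first_counts["competitive"]
    )
```

Because each technique occupies the first 2 weeks for two classes and the last 2 weeks for the other two, course content and sequence are not confounded with technique.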


Dependent Variables 


Dependent variables fall into three cate- 
gories: (a) assessments of tension level, (b) 
assessments of performance, and (c) assess- 
ments of satisfaction. 


Assessments of Tension Level 


Three procedures were used to get at the 
arousal and maintenance of tension systems 
in the individual and in the group. Two 
items on a questionnaire assessed tension 
level in the individual: “This technique 
promotes an easy, relaxed atmosphere in 
class.” “This technique made me feel anx- 
ious and uneasy.” 

The second means of assessing tension 
levels in the group consisted of the use of 
two independent observers who categorized 
interaction of subjects during the experi- 
mental sessions in terms of the Fouriezos, 
Hutt, and Guetzkow (1950) observational 
technique. 

The final means of assessing the unreduced
tensions in the group consisted of a
measure of the Zeigarnik effect produced 
when students volunteered for questions, 
but were unable to answer them completely. 
The Zeigarnik effect is observable in group 
functioning. Horwitz (1954) tested this by 
seeing if the Zeigarnik effect existed for in- 
dividuals in group goal situations and dem- 
onstrated that it did. It would, then, be ex- 
pected in our study that the tension aroused 
by attempting an answer and failing to 
complete it would be more likely to be re- 
duced by the success of another student in 
the cooperative group than in the competi- 
tive group. 

A Zeigarnik Recall Form was constructed 
which simply asked each student to recall as 
accurately as possible each question he vol- 
unteered for and tried to answer, whether 
credit was given or not. These forms were
circulated and collected during the final experimental
session of each condition.


Assessment of Performance 


A daily measure of recitation perform- 
ance was obtained by providing each in- 
structor with a seating chart labeled with 
the names of students in each of his sec- 
tions, and then having him score students 
attempting answers to questions. The pro- 
cedure selected for discussion periods was 
that of recitation-drill, following the form 
described by Guetzkow, Kelly, and Mc- 
Keachie (1954). 

The recitation-drill approach consists of 
bringing to class a prepared list of questions 
on specifically assigned textbook material. 
Depending upon the experimental condi- 
tions, the student or his group was given 
credit for answering the question correctly. 
If the answer was incorrect or incomplete, 
the credit was lost. The instructor kept ask- 
ing the question until someone replied cor- 
rectly. If no one was able to do so, the 
whole group lost credit in the cooperative 
condition, and as many students as volun- 
teered lost credit in the competitive condi- 
tion. When everyone volunteered in both 
situations, the amount lost or gained was 
identical. To the extent that not everyone 
volunteered in the competitive group, then 
the cooperative group suffered a heavier 
penalty. If anything, this inequity operated 
against the major hypothesis, and hence 
was allowed to remain in the design. The 
final measure of group performance for all 
conditions consisted of the number of questions
covered per minute for a given session.
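
The penalty rule for an unanswered question, and the resulting inequity between conditions, can be sketched as follows. The function names are hypothetical, and the sketch covers only the case described above in which no one answers correctly; it counts how many students lose credit and says nothing about the size of the credit, which the text does not give.

```python
def students_penalized(condition: str, class_size: int, n_volunteers: int) -> int:
    """Number of students who lose credit when a question goes unanswered."""
    if not 0 <= n_volunteers <= class_size:
        raise ValueError("n_volunteers must lie between 0 and class_size")
    if condition == "cooperative":
        # The whole group loses credit, whether or not a student volunteered.
        return class_size
    if condition == "competitive":
        # Only the students who volunteered lose credit.
        return n_volunteers
    raise ValueError(f"unknown condition: {condition!r}")


def questions_per_minute(n_questions: int, session_minutes: float) -> float:
    """Group performance measure: questions covered per minute of a session."""
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    return n_questions / session_minutes
```

For a section of 20 with 5 volunteers, 20 students are penalized under cooperation and 5 under competition; the counts coincide only when everyone volunteers.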


The other measure of group performance 
consisted of two scores on the regular course 
hour-long examination. The hour examina- 
tion consisted of two parts: Part 1, cover- 
ing the first 2 weeks of the course (which 
for two sections consisted of the coopera- 
tive condition and for two was the competi- 
tive condition), and Part 2, which covered 
the last 2 weeks of the experimental ses- 
sions. The examination was multiple-choice, 
containing 40 questions with 20 to each part. 


Assessment of Satisfaction 


The following items of the student postsession
questionnaire comprised the measure
of satisfaction: (a) “I preferred this technique
to the other one”; (b) “This technique
made me doubt my own abilities and lowered
my self-assurance”; and (c) “I would
enjoy being taught by this technique.”


RESULTS 


The experimental results support 
the hypotheses. The results indicate 
that the independent variable of dis- 
cussion technique was effectively 
manipulated, and that crucial control 
measures did in fact remain constant 
across sections. In all, 18 predictions 
were made concerning manipulations, 
controls, and dependent measures. Of 
these, 15 were significantly supported, 
two were in the predicted direction 
but were not significant, and one of 
the 18 predictions was neither sig- 
nificant nor in the predicted direc- 
tion. 


Levels of Tension 


The major hypothesis for this re- 
search stated that relative to co- 
operative conditions in the classroom, 
competitive conditions produce an 
excessive tension level, as evidenced 
by greater incidence of self-oriented 
need, lack of relaxed atmosphere, and 
feelings of anxiety and uneasiness. 
Students in the competitive sessions 
consistently showed a greater inci- 
dence of self-oriented need per act 
(as assessed by observers) compared 
with their behavior in the cooperative
sessions. (F = 84.50, df = 1/24, p <
.001.)


TABLE 1
INCIDENCE OF SELF-ORIENTED NEED: OVERALL
ASSESSMENT PER SESSION FOR
COOPERATIVE VERSUS COMPETITIVE
TECHNIQUES

Comp. vs. Coop.    2.68    1.12    9.4*

a Incidence between conditions. (Each
class paired with itself: 1st session cooperative
with 1st session competitive, etc.)
b t test of direct differences computed.
N = 16.
* p < .001.

Similar results hold for the in- 
cidence of self-oriented need assessed 
as an overall measure for each ses- 
sion. 

Questionnaire items (postsession 
questionnaire) relating to anxiety 
perceived by the individual also 
clearly bear out the hypothesis. Stu- 
dents in the competitive sessions felt 
distinctly more tense and anxious 
than they did in the cooperative ses- 
sions. (See Table 2.) 

In general then, we have provided
documentation that the utilization of
competitive discussion procedures under
the controlled conditions specified
in this research can and does lead to
high levels of tension in the classroom.


TABLE 2
DIFFERENCES IN SELF-REPORTED ANXIETY
AND TENSION DURING DISCUSSION

anxious and uneasy    -1.81    27    11.02*

Note. Likert-type scale. Strongly Agree (+3) to
Strongly Disagree (-3).


Disruption of Performance 

In daily recitation, the perform- 
ance measure was the number of 
questions covered during the class 
session. Students in the cooperative 
sessions covered more questions than 
they did in the competitive session. 
(F = 5.65, df = 1/24, p < .05.)
Examination performance of the two 
groups, however, did not differ sig- 
nificantly. 


Unreduced Tensions 


The hypothesis that there should 
be less unresolved tension in cooperative
than competitive classes was tentatively
supported. (F = 3.97, df = 1/24,
p < .10.)


Satisfaction 


The final prediction of the present 
research was that tension levels high 
enough to lead to anxiety, incidence 
of self-oriented need, and disruption 
of performance would result in low- 
ered satisfaction and preference for 
the conditions resulting in these unpleasant
consequences. Table 3 illustrates
that students did prefer the
cooperative method.


TABLE 3
SATISFACTION WITH AND PREFERENCE FOR
THE DISCUSSION TECHNIQUES

Note. Likert-type scale. Strongly Agree (+3) to
Strongly Disagree (-3). N = 8.


Discussion 


The nature of the goal interdepend- 
ency structured in the college class- 
room has a powerful effect upon stu- 
dent behavior. Today’s student has so 
many conflicting demands placed upon 
him by extracurricular activities and 
varied social responsibilities that the 
classroom atmosphere must be potent 
indeed to attract and hold even a part 
of the student’s interests. The pre- 
vailing answer has been to promote 
deliberately a keenly competitive at- 
mosphere in the classroom with the 
hope that students will be motivated 
and devote more energy to school 
work.

It is presumptuous to question the 
effectiveness of competition, espe- 
cially when it is so deeply implicit in 
modern education, unless a sound 
basis for demonstrating otherwise can 
be established. The present research 
has provided such a basis. Competi- 
tion, in particular a competitive 
grading policy, arouses tension in the 
individual. Many find it agreeable 
that competition should do so. What 
is little recognized, however, is that 
the contribution made by competition
to tensions already existing in the
student is so great that undesirable 
consequences may follow. The present 
research demonstrated that students 
in competitive discussion situations 
became more anxious, displayed a 
greater incidence of  self-oriented 
needs, and found themselves losing 
self-assurance. Further, they were less 
able to perform effectively in recita- 
tion, and they became dissatisfied 
with the discussion procedure. When
the discussion was structured coop- 
eratively, students felt less tense, dis- 
played more task-oriented behavior, 
worked more effectively, and enjoyed 
the discussion. 


REFERENCES 


Deutsch, M. A theory of cooperation and
competition. Human Relations, 1949, 2,
129-152.

Fouriezos, N., Hutt, M., & Guetzkow, H.
Measurement of self-oriented need in discussion
groups. Journal of Abnormal and
Social Psychology, 1950, 45, 682-690.

Guetzkow, H., Kelly, E., & McKeachie,
W. An experimental comparison of recitation,
discussion and tutorial methods in
college teaching. Journal of Educational
Psychology, 1954, 45, 193-207.

Horwitz, M. The recall of interrupted
group tasks: An experimental study of
individual motivation in relation to group
goals. Human Relations, 1954, 7, 3-38.


(Received December 16, 1966) 







Journal of Educational Psychology 
1967, Vol. 58, No. 6, 1-27, Part 2


MNEMONIC SYSTEMS IN RECALL*


GORDON WOOD 
Michigan State University 


A series of 5 experiments was conducted to determine if the use of a
mnemonic system influences recall. In Experiment I an attempt was
made to ascertain what elements of the mnemonic system, if any,
were responsible for improving recall. Presentation rate (Experiment
II), transfer paradigm (Experiment III), type of list (Experiment
IV), and list abstractness (Experiment V) were manipulated to assess
whether these variables differentially affected, relative to a control,
the recall of Ss utilizing a particular mnemonic system. The major
findings were: When Ss were presented with an additional list (peg
list) for the learning and recall trial and instructed to make associations,
during the learning trial, between the peg words and the
words to be recalled (response words), recall of the response words
was markedly facilitated. The Ss presented only the response list
and instructed to link successive words of the response list during
the learning trial also had facilitated recall. An interaction between
presentation rate and instruction condition was obtained, suggesting
that a slower rate of presentation is optimal for Ss employing a
mnemonic system. Instruction condition did not interact with
transfer paradigm, type of list, and list abstractness. Implications of
these findings were discussed.


The importance of mnemonic de- 
vices for the acquisition of associa- 
tions between verbal units has been 
suggested by some writers (e.g., 
Bugelski, 1962; Runquist & Farley, 
1964), but the experimental investigation
of mnemonic devices has
been, for the most part, neglected. 
The present series of experiments is 
designed to investigate some of the 
aspects of memory-training systems 
(mnemonic systems). The study at- 
tempts in part to determine under 
what circumstances, if any, memory 
systems facilitate learning, the ele- 
ments of the mnemonic device re- 
sponsible for the “facilitated” learn- 
ing, and the relationships between 
mnemonic devices and some other
variables which have been demonstrated
to affect learning.


*Based on a doctoral dissertation submitted
to the Department of Psychology,
Northwestern University, in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy. Special thanks are
due Carl P. Duncan, chairman of the author's
committee, for his helpful comments and
criticisms. Thanks are also due Albert Erlebacher,
Winfred F. Hill, and Benton J. Underwood.

Although there are slight varia- 
tions among mnemonic systems (e.g., 
Furst, 1957; Nutt, 1941; Roth, 1961), 
the basic components of the systems 
are essentially the same. The first 
phase of most systems involves the 
memorizing of a series of “pegs.”
Generally, words are used as pegs
and each word is numbered, with associations
between numbers and
words, for example, one is a BUN, two
is a SHOE, three is a TREE, four is a
DOOR, etc. Following the memorization
of the peg words, new words can
be memorized by using bizarre im- 
agery to connect the new words (re- 
sponse words) to the peg words. For 
example, given that automobile is the 
first response word (word to be memorized),
the task is to conjure up a
bizarre image connecting the first
peg, BUN, and automobile. A possible
image might be a 5-foot bun driving 
an automobile. The same principle is 
used for all the remaining response 
words. Once a bizarre image has been 


2 Goron Woop 


formed between a peg word and a 
response word, rehearsing either the 
words or the image is believed to be 
unnecessary. To recall any of the re- 
sponse words, it is supposedly only 
necessary to recall the well-mem- 
orized peg words. The peg words are 
presumed to elicit readily the visual 
images previously formed and thus 
make the response words easily avail- 
able. Thus, the peg word is considered 
to be an effective stimulus for the 
word to be recalled. 
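
The structure of the peg system described above can be sketched as a pair of lookups. This is a bookkeeping illustration only, not a model of memory: the names `PEGS`, `form_images`, and `recall` are invented here, the bizarre image is reduced to a stored peg-response pairing, and the peg list is limited to the four pegs given in the text.

```python
# Numbered peg words memorized in the first phase of the system.
PEGS = {1: "BUN", 2: "SHOE", 3: "TREE", 4: "DOOR"}


def form_images(response_words: list) -> dict:
    """Pair each response word, in order, with the peg of the same number."""
    if len(response_words) > len(PEGS):
        raise ValueError("more response words than memorized pegs")
    return {PEGS[i]: word for i, word in enumerate(response_words, start=1)}


def recall(images: dict, position: int) -> str:
    """Retrieve a response word by its serial position, using the peg as the cue."""
    peg = PEGS[position]
    return images[peg]
```

With `automobile` as the first response word, `form_images(["automobile"])` stores the pairing BUN-automobile, and `recall(images, 1)` reaches `automobile` by way of the peg BUN, mirroring the claim that the peg word serves as the stimulus for the word to be recalled.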

There is some evidence to support 
the claim made by the proponents of 
mnemonic systems (e.g., Roth, 1961) 
that the use of a mnemonic system re- 
sults in superior recall. Wallace, 
Turner, and Perkins (1957) presented 
pairs of English words to subjects 
(Ss) and had them, working at their 
own pace, form a visual image be- 
tween the two words. Only one trial 
was given on each pair. The Ss started 
with lists of 25 pairs and worked up 
to lists of 700 pairs. Recall of the 
response word upon being given the 
stimulus word was about 99% up to 
500 pairs; at 700 pairs recall dropped 
to about 95%. Although the authors 
did not utilize a control, this level of 
recall is greater than what might be 
expected from a group not employ- 
ing imagery. Also, there was some 
evidence to indicate that the time re- 
quired to make each association de- 
creased with practice. That is, an S 
given eight lists of 25 word pairs 
and instructed to form associations as 
quickly as possible had an average 
association time per pair of 20.4 sec- 
onds on List 1 and an average asso- 
ciation time per pair of 3.0 seconds 
on Lists 6, 7, and 8. 

Smith and Noble (1965) tested 
Furst’s (1957) mnemonic system at 
three levels of meaningfulness. The 
experimental groups were given a 1- 
hour lecture-demonstration on Furst’s 


hook method of mnemonics, followed 
by 4 days of “private practice” in 
imagery. All Ss were given 20 serial 
learning trials on a consonant-vowel- 
consonant (CVC) list of 10 items. 
After 24 hours Ss returned for 10 
additional trials on the previously 
presented list. Overall, the experimental
groups (those receiving the
special training) showed less loss over 
the 24-hour rest period than the con- 
trols. Yet a complex relationship was 
found between the levels of meaning- 
fulness and the effectiveness of the 
method. There were no differences 
between experimental and control 
groups for highly meaningful lists, 
large differences for the lists of 
medium meaningfulness, and small 
differences for the lists of low
meaningfulness.

Balaban (1910) had Ss learn serial 
lists of 40 words under two different 
sets of instructions. One-half the Ss 
were to use a “mechanical” method, 
while the other half were to form 
“conscious” connections, by the use of 
mnemonics, between successive pairs 
of words. The author found that Ss 
using mnemonics did better than those 
instructed to take a “mechanical”
approach.

Although the above studies are 
somewhat supportive of the position 
that the use of a mnemonic system 
can facilitate recall, more evidence is 
clearly needed. It has not been estab- 
lished what elements of the mnemonic 
system, if any, are responsible for 
facilitated recall or under what con- 
ditions these elements facilitate re- 
call. Also, no attempt has been made 
to determine what variables influence 
the performance of Ss employing a 
mnemonic system. That is, it is con- 
ceivable that variables known to in- 
fluence the performance of Ss unin- 
structed in the use of a mnemonic 
system may not affect the performance
of Ss utilizing a mnemonic system.


Experiment I


The purpose of this experiment was 
to determine if a memory system em- 
ploying peg words, imagery which is 
bizarre, and nonrehearsal of prior im- 
ages facilitates performance relative 
to the appropriate control. In addi- 
tion, the importance of the individual 
components of the mnemonic system 
(namely, presence of peg, image, 
bizarreness of image, and nonre- 
hearsal of previous images) was eval- 
uated. It was not expected that all the 
components of the mnemonic system 
would be of equal importance. Since 
pegs and imagery are the
basis of the system, and nonrehearsal 
of previous images and bizarreness of 
the image are subsidiary components, 
the former were expected to produce 
greater facilitation of performance 
than the latter. 


Method 


Design. In four of the seven conditions Ss 
were given answer sheets on which the peg 
words were written. The four groups given 
the peg words differed with respect to the 
strategy they were asked to use to learn the 
words that were read to them (response 
words). Group 1 was told to make a bizarre 
image which incorporated the response word 
and peg word, and not to rehearse previously 
formed bizarre images. Group 2 was given 
the same instructions as Group 1 except that 
they were told to rehearse earlier formed 
images during any time not devoted to form- 
ing an image to link a peg and response 
word. Group 3 was given the same instruc- 
tions as Group 1 except that they were told 
to form common (rather than bizarre) im- 
ages. Group 4 was told to make a verbal 
association between the word to be mem- 
orized and the peg word. For example, the 
verbal association might involve using a
third word as a mediator between the re- 
sponse word and the peg word. 

The remaining three groups were not
given peg words. Group 5 was instructed to 
make a bizarre image for each response word. 
For example, if the first word is automobile,
S might form an image of an automobile 15
feet high, 2 feet wide, and 4 feet long. Group 
6 was told to link response words with a 
bizarre image. For example, if the first,
second, and third response words are
automobile, saltshaker, and dog respectively, S
might form an image of a large saltshaker 
driving an automobile shaped like a dog. 
Group 7 was left to its own devices. That is, 
these Ss received standard free-recall learn- 
ing instructions. 

To determine if there was a differential 
practice effect (nonspecific transfer) as a 
function of the method used to learn the 
words, each S was presented three different 
lists. Murdock (1960) has evidence to indi- 
cate that the performance of Ss given stand- 
ard free-recall learning instructions and only 
a few trials on each successive list does not 
improve with practice. Thus, one might
expect that there would be greater positive
nonspecific transfer in the imagery conditions 
than in the standard free-recall-instructions 
condition if Ss decrease the time needed to 
form an image as they become practiced 
(Wallace et al., 1957). To measure the prac- 
tice effects, the order of presentation of three 
lists was completely counterbalanced for 
each group. Since there were 18 Ss in each 
condition, there were three replications of 
each order of list presentation. 
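
The counterbalancing just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `counterbalanced_orders` and the list labels are hypothetical, and nothing here is meant to reproduce the original bookkeeping.

```python
from collections import Counter
from itertools import permutations


def counterbalanced_orders(lists, n_subjects):
    """Give each subject one presentation order so that every
    possible order of the lists is used equally often."""
    orders = list(permutations(lists))  # 3 lists give 6 possible orders
    if n_subjects % len(orders) != 0:
        raise ValueError("n_subjects must be a multiple of the number of orders")
    replications = n_subjects // len(orders)
    return [order for order in orders for _ in range(replications)]


# Hypothetical labels for the three response lists of one condition.
orders = counterbalanced_orders(["List A", "List B", "List C"], 18)
usage = Counter(orders)  # how many Ss receive each order
```

With 18 Ss per condition and six possible orders, each order is used by three Ss, which matches the three replications reported above.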

Subjects. The Ss were 126 students from 
introductory psychology classes. Eighteen Ss 
were assigned to each of the seven conditions 
by randomizing the order of the seven con- 
ditions 18 times, then assigning Ss in the or- 
der of their appearance to the randomized 
list of conditions. In actuality, the two con- 
ditions of Experiment III (see below) were 
run at the same time as the seven conditions 
of Experiment I. Thus the order of nine
conditions was randomized 18 times and 162 Ss
were utilized. However, clarity is more
easily maintained if only the seven
conditions of Experiment I are considered at this
time.
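
The assignment scheme (randomize the order of the conditions once per block, then assign Ss in order of appearance) might be sketched as follows. The function name `blocked_assignment`, the integer condition labels, and the seed are assumptions made for illustration only.

```python
import random


def blocked_assignment(n_conditions, n_blocks, seed=None):
    """Return a schedule of condition numbers, one per arriving subject.

    Each block is an independent random ordering of all conditions,
    so every condition receives exactly n_blocks subjects.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = list(range(1, n_conditions + 1))
        rng.shuffle(block)  # randomize the order of conditions in this block
        schedule.extend(block)
    return schedule


# Nine conditions (seven of Experiment I, two of Experiment III), 18 blocks.
schedule = blocked_assignment(n_conditions=9, n_blocks=18, seed=0)
```

The k-th S to appear in the laboratory would then serve in condition `schedule[k]`, giving 162 Ss in all with 18 per condition.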

Materials. Six lists of 40 words each were 
used in this experiment (Table A1, Appen- 
dix A). All but 21 of the 240 words were 
taken from the Underwood and Richardson 
(1956) norms. The additional 21 words were 
selected with the only restriction that they 
be, as are the 219 Underwood and Richard- 
son words, relatively high-frequency con- 
crete nouns. These words were selected be- 
cause concrete words are believed to elicit 
images more readily than abstract words 
(Paivio, 1965). The 240 words were randomly
divided into six lists of 40 words each
with the only restriction that each list
contain approximately the same number of
words starting with any specified letter. Of 
the six lists, three were randomly selected to 
serve as peg lists and three as response lists. 
The order of presentation of each peg list 
and response list, and thus the pairings of 
the peg and response words, was random 
with the restriction that no peg and response 
word pair had the same initial letter. 
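
One way to obtain a random pairing that respects the initial-letter restriction is simple rejection sampling: shuffle, check, and reshuffle if any pair fails. The sketch below assumes that approach; the source does not say how the restriction was enforced, and the function name and the short word lists are illustrative only.

```python
import random


def pair_without_shared_initial(pegs, responses, seed=None, max_tries=10_000):
    """Randomly order both lists and pair them position by position,
    rejecting any arrangement in which a pair shares its first letter."""
    rng = random.Random(seed)
    pegs = list(pegs)
    responses = list(responses)
    for _ in range(max_tries):
        rng.shuffle(pegs)
        rng.shuffle(responses)
        pairs = list(zip(pegs, responses))
        if all(p[0].lower() != r[0].lower() for p, r in pairs):
            return pairs
    raise RuntimeError("no acceptable pairing found")


# Illustrative four-word lists; the actual lists had 40 words each.
pegs = ["bun", "shoe", "tree", "door"]
responses = ["automobile", "saltshaker", "dog", "table"]
pairs = pair_without_shared_initial(pegs, responses, seed=1)
```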
Procedure. The Ss were run in groups of 
seven, all seven conditions being represented 
in any one group session. On entering the ex- 
perimental room, Ss were seated and then 
given a five-page handout. On page 1 (cover 
page) a note cautioned S not to turn the 
cover page until told to do so. On page 2 the 
special instructions for each condition were 
printed. That is, each S was told how to 
attempt to learn the lists of response words. 
The special instructions were in considerable 
detail; examples of each strategy were given 
in the instructions. To demonstrate the na- 
ture of the instructions, the general instruc- 
tions given to all groups and the special in- 
structions for Groups 1, 4, 6, and 7 are pre- 
sented in Appendix B. Pages 3-5 were 
answer sheets. A peg-word list was printed 
on each of the answer sheets for Ss in Groups 
1-4. In any one set of seven Ss run at one 
time, the order of the answer sheets was the 
same, but the order of the answer sheets, 
and therefore the order of presentation of 
the response-word lists, varied from set to 
set. Each S served in only one session. The 
peg word for any given response word was 
the same for all Ss in Groups 1-4. The three 
answer sheets for each S not receiving peg 
words were identical. 
Prior to the turning of the cover page, Ss 
were given general information about the 
experiment. They were told the number of 
lists to be learned (three), number of words 
per list (40), presentation rate of the list (5 
seconds per word), and mode of list presen- 
tation (tape recorder). After this information 
was given, Ss were given an opportunity to 
ask questions. After all questions had been 
answered, the experimenter (E) explained 
that each S was to employ a particular strat- 
egy in the learning of the three lists. The E 
stressed the importance of employing only 
the strategy explained on page 2 of the 
handouts. The Ss were then instructed to 
turn to page 2 of their handouts and study 
their instructions. The E cautioned the Ss 
not to ask questions about their individual 
instructions, but, instead, to study their
instructions until they fully understood the
strategy they were to use. After each S
informed E that the instructions were
completely understood, E turned on the
Wollensak tape recorder on which the response
words had previously been recorded. Fol- 
lowing the reading of each list, E stopped 
the tape recorder and asked Ss to write as 
many of the words as they could on their 
answer sheet. After Ss indicated that no 
more time was necessary, E instructed Ss 
to turn to their next answer sheet and E 
started the tape of the next list. Following 
recall of the third list, Ss were cautioned not 
to discuss the experiment with anyone and 
then were dismissed. 


Results 


Each S's answer sheets were scored 
to determine the number of response 
words correctly recalled, disregarding 
order of recall. Spelling and number 
(i.e., singular vs. plural) errors were
disregarded in scoring. Homonyms 
(e.g., steak for stake) were accepted 
as correct, but incomplete responses 
(e.g., snake for rattlesnake) and mis- 
takes apparently due to misunder- 
standing the intended tape word (e.g., 
teach for peach) were not accepted 
as correct. These scoring criteria were 
used for all experiments. 

The mean number of words recalled
as a function of stage of practice for 
each of the seven conditions is pre- 
sented in Figure 1. An examination 
of Figure 1 reveals that the groups 
given peg words (Conditions 1-4)
performed better than the three 
groups not given peg words. The 
group instructed to rehearse pre- 
viously formed images (Group 2), the 
group instructed to form common 
images (Group 3), and the group told 
to use verbal mediation rather than 
imagery (Group 4) all performed 
slightly better than Group 1. Yet, 
because these differences were not 
significant by either a Dunnett or 
Duncan (p > .05) range test, using 
the total number of words recalled for 
the three lists, this experiment did not 
support the notion that rehearsing 
previously formed images, bizarreness 
of the image formed, or imagery (as
opposed to verbal mediation) significantly
influence performance. Moreover,
since Groups 1-4 were superior
to Group 7 (control) by either a
Dunnett (p < .01) or Duncan (p <
.001) range test, it seems plausible to
assume that the presence of the peg 
words was responsible for the facili- 
tated performance. 

The performance of Group 6 (in- 
structed to link words with bizarre 
images) demonstrated that factors 
other than the presence of peg words 
can result in facilitated recall. Since 
Group 6 was not given peg words but 
still performed at a level significantly 
(Duncan test, p < .01) superior to 
the control (Group 7), it seems ap- 
parent that the presence of peg words 
is not the sole means by which im- 
proved recall can be obtained. Yet 
the results suggest that the presence 
of peg words produces a greater 
facilitation of performance than does 
giving instructions to link successive 
response words. Both a Duncan test 
and a Dunnett test (p > .05) on the 
total number of words correctly re- 
called indicated that Groups 1 and 6 
did not differ significantly, but 
Groups 2-4 were significantly superior 
to Group 6 by Duncan’s test (p <
.05).


As can be seen in Figure 1, the per- 
formance of Groups 5 and 7 was 
markedly inferior to that of the other 
five groups. The difference between 
Groups 5 and 7 was not significant, 
but both of these groups were significantly
inferior (Duncan, p < .001)
to the other five groups. Since Groups 
5 and 6 differed only with respect to 
the instruction to link words with a 
bizarre image (Group 6) or to form 
a distinct bizarre image for each 
word (Group 5), it is probable that 
linking of items during learning is
responsible for the facilitated recall of
Group 6. 


Since the groups given peg words 
were told to link the peg words and 
response words during learning, the 
peg and response words should be correctly
paired at recall. The mean
number of words recalled in the correct
position (i.e., correctly paired
with the appropriate peg word) was
computed for Groups 1-4. This measure
strongly supported the notion that
Ss linked the peg words and response 
words during learning in that the 
mean number of misplaced items per 
list was only .54, .43, .26, and .24 for 
Groups 1-4 respectively. These groups 
not only recalled more words, but 
they recalled them in the correct 
order. The mean number of words re- 
called in the correct position (i.e., in 
the correct blank of the answer sheet) 
over all three lists was only 3.44, 6.28, 
and 4.67 for Ss in Groups 5-7. Thus, 
if the mean number of words recalled 
in the correct position is used as a 
response measure, Groups 1-4 were
vastly superior to Groups 5-7. 
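
The distinction between words recalled at all and words recalled in the correct position can be made concrete with a small scoring sketch. It assumes an answer sheet with one blank per presented word and exact string matching (so the leniency toward spelling and number errors described earlier is not modeled); the function name and the four-word example are hypothetical.

```python
def score_answer_sheet(presented, written):
    """Score one answer sheet.

    presented: words in presentation order (one per blank).
    written:   what S wrote in each blank, or None if left empty.
    """
    targets = set(presented)
    entries = [w for w in written if w is not None]
    correct = {w for w in entries if w in targets}       # order disregarded
    in_position = sum(1 for target, w in zip(presented, written) if w == target)
    intrusions = sum(1 for w in entries if w not in targets)
    return {
        "correct": len(correct),
        "in_position": in_position,
        "misplaced": len(correct) - in_position,
        "intrusions": intrusions,
    }


presented = ["automobile", "saltshaker", "dog", "peach"]
written = ["automobile", "dog", None, "teach"]  # "teach" for "peach" is not accepted
scores = score_answer_sheet(presented, written)
```

Here "dog" counts as correct but misplaced, and "teach" counts as an intrusion, following the scoring criteria stated in the Results.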

Fig. 1. Mean recall as a function of stage of practice for the seven instruction conditions of Experiment I.

Fig. 2. The serial-position curves for the seven instruction conditions of Experiment I.

The number of words correctly recalled
as a function of serial position
was obtained to determine whether
instruction condition influenced the
customary bowed serial-position effect
of free recall (Deese & Kaufman,
1957). The serial-position curves for
the first list presented to S are given
in Figure 2. From this figure it can be
seen that the groups not given peg 
words (Groups 5-7) showed the typi- 
cal bowed serial-position effect while 
the groups given peg words (Groups 
1-4) did not. Groups 5-7 had as large
a primacy as a recency effect. Al- 
though other studies (e.g., Murdock, 
1962) have generally found a greater 
recency than primacy effect, Mur- 
dock points out that the studies in 
which Ss were instructed to recall the 
words in the order that they were pre- 
sented (as in the present study) have 
not obtained a greater recency than 
primacy effect (e.g., Bousfield, Whit- 
marsh, & Esterson, 1958). The failure 
to find a serial-position effect for 
those groups given peg words (Groups 
1-4) is consistent with the finding 
that a serial-position effect is not ob- 
tained when a paired-associate list is 
presented in a constant serial order 
(Battig, Brown, & Nelson, 1963). 

The number of intrusions per list 
and the intrusion rate per list were
computed to determine if the conditions
differed with respect to the
eliciting of erroneous responses. The 
mean number of intrusions per list 
was 2.93, 3.00, 2.91, 2.83, 1.65, 1.33, 
and 2.04 for Groups 1-7 respectively. 
A Duncan’s range test on the total 
number of intrusions revealed that 
the groups given peg words (Groups 1-4)
had significantly (p < .05) more
intrusions than the two groups told to 
use bizarre imagery but not given peg 
words (Groups 5 and 6). No other 
differences were significant. However, 
the mean number of intrusions may 
not be the most appropriate measure 
of overt errors because Groups 1-4
had greater opportunity for intrusions 
in that they had higher mean recall 
than the other groups. To equate for 
total recall, an intrusion-rate measure 
was obtained by dividing the number 
of intrusions by the total number of 
responses made (both correct and incorrect)
for each S. The mean intrusion
rate was .105, .097, .096, .092,
.090, .056, and .080 for Groups 1-7
respectively. A Duncan's range test
indicated that Group 6 had a lower
intrusion rate than Groups 1-3 (p < 
.05); all other differences were not 
significant. Thus, although the in- 
trusion and intrusion-rate measures 
are not in complete agreement, these 
measures suggest that Ss instructed to 
link successive response words during 
the learning trial have fewer intru- 
sions and a lower intrusion rate than 
Ss provided with and instructed in the 
use of peg words. 
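
The intrusion-rate measure defined above (intrusions divided by all responses made, computed for each S and then averaged) can be sketched as follows. The counts are hypothetical and the function name is an assumption; only the formula comes from the text.

```python
from statistics import mean


def intrusion_rate(n_correct, n_intrusions):
    """Intrusions as a proportion of all responses S made."""
    total = n_correct + n_intrusions
    return n_intrusions / total if total else 0.0


# Hypothetical (correct, intrusions) counts for three Ss in one group.
group = [(27, 3), (18, 2), (30, 0)]
rates = [intrusion_rate(c, i) for c, i in group]
group_mean_rate = mean(rates)  # mean of per-S rates, not a ratio of totals
```

Because the rate is computed per S before averaging, a group can have more intrusions in absolute terms and still show a rate close to that of a group with lower recall, which is why the two measures need not agree.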

An analysis of variance (Latin- 
square design) was made on the num- 
ber of correct responses to determine 
whether there was a differential prac- 
tice effect as a function of the strategy 
Ss were instructed to employ to learn 
the response words. The summary 
table of this analysis is presented in 
Table 1. The analysis of the ordinal 
position effect revealed a significant 
main effect and a Groups X Ordinal 
Position interaction. Since an exami- 
nation of Figure 1 suggested that 
those groups given peg words showed 
an improvement with practice, while 
those not given peg words did not, an 
orthogonal comparison was made to 
determine if the significant Groups X 
Ordinal Position interaction was due 
largely to greater practice effects for 
groups given peg words than for those 
groups not given peg words. That is, 
Groups 1-4 were compared with 
Groups 5-7 (Comparison A, Table 1) 
to test the notion that the presence of 
the peg words might account for 
much of the differences in practice ef- 
fects. As can be seen in Table 1, this 
comparison accounts for a considera- 
ble amount of the Groups X Ordinal 
Position interaction. Moreover, the 
comparison is significant for both the 
linear and quadratic components of 
the trend. 
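
As a reading aid for Table 1, the F ratios can be recomputed from the mean squares. The sketch below assumes, as the reported values suggest, that between-subject effects are tested against Subjects/G,S and within-subject effects against the pooled residual; the helper name `f_ratio` is hypothetical.

```python
def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


MS_SUBJECTS = 78.95  # Subjects/G,S (assumed error term for Groups)
MS_POOLED = 7.58     # pooled residual (assumed error term for within-S effects)

checks = {
    "Groups": round(f_ratio(1441.21, MS_SUBJECTS), 2),
    "Ordinal position": round(f_ratio(162.20, MS_POOLED), 2),
    "OP linear": round(f_ratio(262.10, MS_POOLED), 2),
    "Comp. A (linear)": round(f_ratio(43.89, MS_POOLED), 2),
}
```

The four recomputed values agree with the F values reported for those rows (18.25, 21.40, 34.58, and 5.79).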

TABLE 1
ANALYSIS OF EXPERIMENT I

Source of variance            df        MS           F
Groups (G)                     6     1441.21      18.25***
Sequence (S)                   5       14.88        -
G X S                         30      147.12       1.80*
Subjects/G,S                  84       78.95
Lists (L)                      2       12.76       1.08
Ordinal position (OP)         (2)    (162.20)    (21.40)***
  Linear                       1      262.10      34.58***
  Quadratic                    1    [illegible]  [illegible]
G X OP                       (12)   [illegible]  [illegible]**
  Between G linear            (6)   [illegible]  [illegible]
    Comp. A                    1       43.89       5.79*
    Residual                   5        2.36       1.08
  Between G quadratic         (6)     (21.14)     (2.79)*
    Comp. A                    1         .22       4.88*
    Residual                   5       18.72       2.47*
G X L                         12       11.51       1.51
Square uniqueness              8       25.27       3.33***
G X Square uniqueness         48       12.58       1.66*
Pooled residual              168        7.58
Total                        377

It should be noted that an assumption
of the Latin-square design may
have been violated in the present
experiment. The model for the analysis
in Table 1 assumes that there is
no interaction between ordinal posi- 
tion and lists. If this interaction, in 
fact, exists, then the ordinal position 
effect might be confounded. A partial 
test for this interaction is to test 
square uniqueness with the pooled 
residual. Since square uniqueness is 
significant, there is some reason to 
question the assumption of no interac- 
tion between ordinal position and 
lists. However, there is also reason 
not to be too concerned in that all 
possible sequences (complete counter- 
balancing of lists and ordinal posi- 
tion) were used in the present experi- 
ment. Grant (1948) maintains that 
when there is complete counterbal- 
ancing there is no need to be con- 
cerned with the assumption of no in- 
teraction between ordinal position 
and lists. However, since subsequent 
experiments of this series afford ad- 
ditional information concerning the 
influence of the ordinal position vari- 
able, further consideration of this is- 
sue may be unnecessary. 

In summary, the results of Experiment I
indicated that, with the present
materials, providing Ss with peg
words or instructing them to link
successive response words during
presentation resulted in superior recall
relative to the appropriate control.
The use of peg words enabled Ss
to recall words in the same order that 
they were presented; correct ordering 
of recall was not obtained when peg 
words were not provided. There was 
some indication that instructing Ss to 
link successive response words re- 
sulted in fewer intrusions and a lower 
intrusion rate than providing them 
with and instructing them in the use 
of peg words. The serial-position 
curves obtained were generally in 
agreement with the results of other 
investigators. Although there was 
some question whether a necessary 
assumption of the underlying statistical
model had been satisfied, the
results suggested that groups given 
peg words improved with practice, 
but groups not given peg words did 
not. 


Experiment II


The purpose of this experiment was 
to investigate how the performance of 
Ss employing a mnemonic system is
influenced by presentation time per 
item. It was believed that Ss required 
to form images or utilize mediators 
may benefit more from a longer pres- 
entation time per item than Ss not 
required to form images or use mediators.
That is, if the use of mnemonics
results in “better learning” and if us- 
ing mnemonics (forming images or 
utilizing mediators) requires more 
time than not using mnemonics, then 
Ss utilizing a mnemonic system should 
be able to benefit more from a slow 
presentation rate than Ss not employ- 
ing a mnemonic system. Although 
there is little evidence to indicate 
what the optimal presentation rate is 
(if, in fact, there is one) for Ss using
mnemonics, evidence from the Wallace
et al. (1957) study suggests that
the optimal rate is slower than 3 sec- 
onds. That is, in the Wallace et al. 
study a practiced S required an average
of 3 seconds to make a “mnemonic”
association between two
words. Thus, it should follow that a
presentation rate of over 3 seconds is 
likely to be more nearly optimal than 
a presentation rate of less than 3 sec- 
onds for relatively unpracticed Ss. 


Method 


Design. Two conditions were run in this 
experiment: a group given the peg words 
and complete imagery instructions (i.e., like 
Group 1 of Experiment I), and a group given 
standard recall instructions (i.e., like Group 
7 of Experiment I). Both groups were pre- 
sented the words at a rate of 2 seconds per 
word. Comparisons were made between these 
two groups and the corresponding groups of 
Experiment I. The design is a 2 X 2 factorial 
with instructions (mnemonic system) as one 
variable and the presentation time per word 
as the second variable. The prediction is 
that the interaction between these two 
variables will be significant such that there 
will be a greater difference in performance 
between the mnemonic system condition and 
the nonmnemonic system condition at the
5-second presentation rate than at the 2- 
second rate. 
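
The predicted interaction amounts to a difference of differences across the four cells of the 2 X 2 design. A minimal sketch follows; the cell means are invented purely to show the arithmetic and are not results from this study, and the function name is hypothetical.

```python
def interaction_contrast(cell_means):
    """(mnemonic minus control at 5 sec) minus (mnemonic minus control at 2 sec)."""
    diff_slow = cell_means[("mnemonic", 5)] - cell_means[("control", 5)]
    diff_fast = cell_means[("mnemonic", 2)] - cell_means[("control", 2)]
    return diff_slow - diff_fast


# Hypothetical mean recall per list, keyed by (instructions, seconds per word).
hypothetical_means = {
    ("mnemonic", 5): 30.0,
    ("control", 5): 18.0,
    ("mnemonic", 2): 17.0,
    ("control", 2): 12.0,
}
contrast = interaction_contrast(hypothetical_means)
```

A positive contrast corresponds to the prediction stated above: a larger advantage for the mnemonic system condition at the 5-second rate than at the 2-second rate.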

Procedure and materials. The procedure 
and materials were the same as they were 
for the corresponding groups of Experi- 
ment I, except that the response words were 
presented at a 2-second rather than a 5-sec- 
ond rate. Since only two conditions were 
used, Ss were run in groups of six, three Ss 
in each condition. The six Ss receiving the 
same order of the lists were run at the same 
time. Experiment II was run concurrently 
with Experiment I. That is, Ss were ran- 
domly assigned to serve in Experiment I or 
Experiment II solely as a function of their 
order of appearance in the laboratory. 


Results 


The mean number of words recalled
as a function of stage of practice for
Conditions 1 and 7 of Experiments I
and II is presented in Figure 3. As
expected, presentation rate (2 seconds
versus 5 seconds) and instructional
condition (Condition 1 versus 7) both
significantly influenced performance
(Table 2). The most important result
of this experiment, however, is the
significant interaction between
Presentation Rate X Instruction Condition.
A graphical presentation of this
interaction is represented in Figure 4
by plotting the mean recall over all
lists as a function of presentation
rate. Thus, this study indicates that
the relative effectiveness of a mnemonic
system strategy depends in
part on the rate of presentation of the
items.

Fig. 3. Mean recall as a function of stage of practice for the four groups of Experiment II. (Presentation rate, 2 versus 5 seconds, and instruction condition, 1 versus 7, were the variables manipulated.)

TABLE 2
ANALYSIS OF EXPERIMENT II

Source of variance            df        MS           F
Presentation rate (PR)         1     2787.85      43.75***
Instructions (I)               1     2373.41      37.25***
Sequence (S)                   5       44.83        -
PR X I                         1      541.50       8.50**
PR X S                         5       55.24        -
I X S                          5      226.95       3.56**
PR X I X S                     5    [illegible]    1.92
Subjects/PR,I,S               48    [illegible]
Ordinal position (OP)          2      126.46      18.12***
Lists (L)                      2        9.86       1.41
PR X OP                        2        6.80        -
I X OP                         2        6.36        -
PR X L                         2        9.34       1.84
I X L                          2       17.50       2.51
I X PR X OP                    2       13.55       1.94
I X PR X L                     2        6.12        -
[illegible row]
Square uniqueness           [illegible]
Pooled residual               96        6.98
Total                        215

Fig. 4. Mean recall over all three lists for Instruction Conditions 1 and 7 at two presentation rates.

The failure to find an interaction
between instruction condition and
ordinal position in this experiment
(see Table 2) and the presence of this
interaction in Experiment I suggest
that presentation rate may be a factor
determining whether Ss utilizing a
mnemonic system improve with practice.
When a 5-second presentation
rate was used, Ss in Condition 1
improved with practice; however, Ss in
Condition 1 did not show an improvement
with practice when a 2-second
presentation rate was used. Condition
1 of Experiment I and Experiment II
were analyzed to determine if the
apparent improvement with practice
depended on presentation rate. Since
this interaction was not significant
(F = 1.18, df = 2/48, p > .05), the
notion that improvement with practice
for Ss in Condition 1 depended on
a slow rate of presentation was not
supported.


Experiment III


The purpose of this experiment was 
to compare a mnemonic system group
(namely, instructed to use a bizarre
image to connect each peg and response
word, and cautioned not to rehearse
previously formed images) and
a mediation group (namely, instructed
to find a verbal means, such as using
a third word as a mediator, to connect
each peg and response word), in
a negative transfer paradigm. Numerous
studies (e.g., Twedt & Underwood,
1959; Underwood, 1951) have
demonstrated that Ss learning two 
different successive responses to the 
same stimulus (A-B, A-D) have a 
more difficult time with the second 
pair than Ss learning a different re- 
sponse to each of two stimuli (A-B, 
C-D). Since the mnemonic system 
and mediation groups of Experiment 
I were presented a different peg list 
for each response list (A-B, C-D), in 
Experiment III it was only necessary 
to measure the performance of a 
mnemonic system group and a medi- 
ation group having the same peg list 
for the three response lists. Then, by 
comparing the performance of the 
mnemonic system groups and mediation
groups of Experiment I and
Experiment III it is possible to determine
whether a mnemonic system
strategy or mediation strategy results
in the greater negative transfer. That
is, it should be possible to determine 
whether the effect of having the same 
pegs for consecutive lists of response 
words produces a greater negative ef- 
fect when S uses a bizarre image to 
connect each peg and response word 
or when S uses some form of verbal 
mediation. 

The interfering effects of the first 
list are expected to be greater in the 
mediation condition, thus resulting in 
greater negative transfer for the 
mediation condition. The reasoning 
for the prediction, although there is 
no evidence to support it, is based on 
the claim made by the advocates of
mnemonic systems (e.g., Roth, 1961)
that the same peg words can be used 
over and over again without obtain- 
ing interference from previously 
learned response words. 


Method 


A mnemonic system condition (i.e., like
Group 1 of Experiment I) and a mediation
condition (i.e., like Group 4 of Experiment
I) were the only groups utilized. The
procedure was identical to the procedure of
Experiment I except that the peg words for
any one S were the same for each of the
three different response lists. One-third of 
the Ss (N = 6) in each condition had Peg 
List 1, one-third Peg List 2, and one-third 
Peg List 3. As was pointed out earlier, these 
two conditions were run at the same time as 
the seven conditions of Experiment I. 


Results 


The mean number of words correctly
recalled for the four groups as
a function of stage of practice is presented
in Figure 5. This figure clearly
indicates that the performance of Ss
given the same peg words for successive
lists (i.e., A-B, A-D) decreased
with practice while the performance
of Ss given different peg words for
successive lists (i.e., A-B, C-D) increased
with practice. These differences
were significant (Table 3); that
is, the ordinal position (practice)
variable and the Ordinal Position X
Paradigm interaction were significant.
It should be noted that the test of the
assumption of no interaction between
ordinal position and lists supported
that assumption in this experiment,
indicating that the Latin-square
analysis was appropriate: the F values
for square uniqueness and Groups X
Square Uniqueness failed to reach
significance.

Fig. 5. Mean recall as a function of transfer paradigm (A-B, C-D versus A-B, A-D) and stage of practice for Instruction Conditions 1 and 4 of Experiment III.

Although there was positive trans- 
fer with the A-B, C-D paradigm and 
not with the A-B, A-D paradigm, 
there was no evidence of differential 
transfer as a function of instruction 
condition (Table 3). The prediction 
that Ss using imagery would be able 
to use the same peg words to learn 
successive lists without obtaining in- 
terference effects was not supported. 
In addition, there was no evidence 
that the absolute recall level differed 
for the two instruction conditions. 
An analysis of the number of words 
recalled in the correct position, the 
number of intrusions and the intru- 
sion rate also failed to reveal any dif- 
ferences between the bizarre imagery 
and mediation condition. 


TABLE 3
ANALYSIS OF EXPERIMENT III

Source of variance            df        MS           F
Paradigm (P)                   1      352.67       3.21
Instructions (I)               1        5.35        -
Sequence (S)                   5      254.33       2.91
P X I                          1       75.85        -
P X S                          5       26.60        -
I X S                          5       32.09        -
P X I X S                      5      102.25        -
Subjects/P,I,S                48      100.88
Ordinal position (OP)          2       36.81       3.15*
Lists (L)                      2        2.24
P X OP                         2      123.18      10.53***
I X OP                         2        5.84        -
P X L                          2       42.89       3.07*
I X L                          2    [illegible]     -
I X P X OP                     2    [illegible]  [illegible]
I X P X L                      2        4.96        -
Square uniqueness              8       18.74       1.60
Groups X Square uniqueness    24    [illegible]  [illegible]
Pooled residual               96    [illegible]
Total                        215

*p < .05.
***p < .001.


EXPERIMENT IV


This experiment was conducted to 
determine if instruction condition 
would interact with the type of list 
used. An interfering (I) and a media- 
tion (M) list were constructed by us- 
ing normative data. Because of the 
nature of the words selected, Ss em-
ploying a mediation approach (i.e.,
instructed to verbally connect each 
peg and response word) were expected 
to do well on List M and poorly on 
List I. That is, it was expected that 
Ss would have little difficulty in 
thinking of words to mediate the peg 
and response word pairs in List M, 
but experience considerable difficulty 
with the pairs of List I. Although a 
mediation group was expected to do 
better on List M than List I, there 
was no particular reason to suspect 
that Ss employing imagery should 
perform differentially on the lists. The 
type of list learned was expected to 
interact with the method used to 
learn the list such that there would 
be a marked difference between the 
performance on the two lists under 
one method (verbal mediation) and 
not under the other (bizarre imagery). 


Method 


All 36 Ss in each of the two conditions, 
bizarre imagery and verbal mediation, 
learned the two lists using a procedure es-
sentially identical to the procedure used in
Experiments I-III. The Ss were given a four- 
page handout consisting of a cover page, 
special instructions regarding strategy, and 
two answer sheets. A list of peg words was 
printed on each answer sheet. Each response- 
word list was presented on a tape recorder 
at a rate of 5 seconds per word. The order 
of presentation of the response lists was 
counterbalanced. 

List I was constructed by using 40 stim-
ulus words (e.g., TABLE) and the dominant re-
sponse to each stimulus word (e.g., chair)
from the Minnesota norms (Russell & Jen- 
kins, 1954). The stimulus words were used 
as peg words and the dominant responses 
were used as response words. However, the 
peg words and response words were re-
paired. For example, the response word to
the peg word TABLE was crackers instead of
chair. The learning of this list was expected 
to be difficult for Ss using a mediational ap- 
proach because the dominant response to 
each peg word may be an incorrect response. 
If TABLE elicits chair, chair might interfere
with the learning of the correct response 
word, crackers. 
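
The re-pairing used to build List I can be sketched as follows. This is a minimal illustration, not the procedure actually followed: the text does not say how the re-pairing was carried out, so random re-pairing under the constraint that no peg word keeps its own dominant response is an assumption. Only the TABLE-chair pair comes from the text; the remaining pairs and the function name `re_pair` are hypothetical placeholders.

```python
import random


def re_pair(pairs, seed=0):
    """Re-pair pegs and responses so that no peg keeps its dominant response.

    Assumes every response word in `pairs` is distinct.
    """
    if len(pairs) < 2:
        raise ValueError("need at least two pairs to re-pair")
    rng = random.Random(seed)
    pegs = [peg for peg, _ in pairs]
    responses = [response for _, response in pairs]
    while True:
        shuffled = responses[:]
        rng.shuffle(shuffled)
        # Accept the shuffle only if every peg received a response
        # other than its own dominant associate.
        if all(new != old for new, old in zip(shuffled, responses)):
            return list(zip(pegs, shuffled))


# TABLE-chair is taken from the text; the other pairs are placeholders.
norm_pairs = [
    ("TABLE", "chair"),
    ("STIM2", "resp2"),
    ("STIM3", "resp3"),
    ("STIM4", "resp4"),
]
list_i = re_pair(norm_pairs)
```

Rejecting any shuffle that leaves a peg with its own response guarantees that the dominant associate is always an incorrect response, which is the property the list was meant to have.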

List M was constructed by using 80 words 
(40 pairs) from the Underwood and Richard- 
son sensory impression norms. Pairs of words 
were selected which elicit the same sensory 
impression response (e.g., CIGAR and ETHER
both elicit smelly as a dominant sense im- 
pression). The Ss using a mediational ap- 
proach were expected to do well on List M 
because previous pilot work using a similar 
list indicated that Ss are able to supply ap- 
propriate mediators. For example, Ss asked 
to learn the response cigar to the stimulus 
ETHER reported using the mediator smelly. 
Lists I and M are presented in Table A2, 
Appendix A. 


Results 


The mean number of words re- 
called for Condition 1 (bizarre im- 
agery) was 27.22 and 33.00 for Lists 
I and M respectively. For Condition 
4 (verbal mediation) mean recall 
was 25.17 and 30.89 for Lists I and 
M respectively. List M was signifi- 
cantly easier than List I (F = 56.84, 
df = 1/32, p < .01). The only other 
significant effect was that of ordinal 
position (F = 10.04, df = 1/32, p <
.01). Since there was an increase in
performance as a result of practice, 
this experiment is consistent with the 
earlier three experiments in demon- 
strating a practice effect for those 
groups given peg words and a 5-second 
presentation rate. The predicted in- 
teraction between instruction condi- 
tion (bizarre imagery versus verbal 
mediation) and type of list (I versus 
M) was not obtained. Therefore, the 
results of this study are in agreement 
with the results of Experiment III in 
that both experiments failed to reveal 
any differential performance as a
function of instruction condition (i.e.,
main effect or interactions). 
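
The absence of the predicted interaction can be seen directly in the cell means reported above. The sketch below computes the List M minus List I difference for each instruction condition and then the difference between those two differences. The variable and function names are illustrative, and the calculation is descriptive only; it is not the significance test reported in the text.

```python
# Mean words recalled in Experiment IV, as reported in the text.
means = {
    "bizarre imagery": {"I": 27.22, "M": 33.00},
    "verbal mediation": {"I": 25.17, "M": 30.89},
}


def list_effect(condition):
    """Advantage of List M over List I for one instruction condition."""
    return means[condition]["M"] - means[condition]["I"]


imagery_effect = list_effect("bizarre imagery")      # about 5.78 words
mediation_effect = list_effect("verbal mediation")   # about 5.72 words

# Interaction contrast: how much the list effect differs between conditions.
interaction_contrast = imagery_effect - mediation_effect  # about 0.06 words
```

The prediction required a large list effect under verbal mediation and little or none under bizarre imagery; the two list effects are instead nearly identical.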

An analysis was made of the num- 
ber of words correctly paired, the 
number of intrusions, and the intru- 
sion rates to determine whether any 
of these measures differentiated the 
two instruction conditions. As in the 
previous experiments, most of the 
words recalled by Ss using bizarre 
imagery or verbal mediation were re- 
called in the correct position. The 
mean number of misplaced words 
was .44 and 1.54 for Conditions 1 and 
4 respectively; this difference is not 
significant. The number of intrusions 
and the error rates did not differ for 
the two conditions. These findings sug- 
gest that the instruction conditions of 
Experiments III and IV are func- 
tionally equivalent in that the per- 
formance of Ss instructed to use 
bizarre imagery did not appear to 
differ in any way from Ss instructed 
to use a verbal mediation strategy. 


EXPERIMENT V 


Paivio (1965) tested the assump-
tion that concrete (C) nouns (e.g.,
string, tree, coffee) are superior to 
abstract (A) nouns (e.g., idea, mo- 
ment, soul) in their capacity to elicit 
sensory images, and that imagery can 
mediate the associative connection
between items in a pair. Paivio pre-
dicted that learning would be better
if the concrete noun were in the
stimulus position than if it were in the
response position because the tend-
ency of the word to arouse mediating
imagery should be greater when the
word is in the stimulus position. That
is, if sensory images can effectively
mediate the learning of paired-asso-
ciate pairs, and if one assumes that
the sensory images elicited by the
stimulus term are more crucial to
paired-associate learning than the
sensory images elicited by the re-
sponse term, then it follows that C-A
lists should be easier to learn than 
A-C lists. 

Paivio predicted that the ease of 
learning four stimulus-response com- 
binations constructed from concrete 
and abstract nouns would be in the
order C-C, C-A, A-C, A-A. This 
prediction was supported. Also the 
ratings for the words on the ease with 
which they aroused sensory images 
indicated, as expected, that concrete 
nouns elicit images more readily than 
abstract nouns. This study, there- 
fore, strongly suggests that the vari- 
able of abstractness is crucial to any 
consideration of a mnemonic device 
employing imagery. Moreover, if im- 
agery is responsible for the superior 
performance on concrete lists, then it 
should follow that one must predict a 
greater facilitation of performance 
for concrete (vs. abstract) words 
under instructions to employ imagery 
than under nonimagery instructions. 
That is, the difference in performance 
between concrete and abstract lists 
should be greater under a mnemonic 
system condition than under a verbal 
mediation condition. 


Method 


The design of this experiment was essen- 
tially identical to Experiments III and IV 
in that a bizarre imagery condition and a 
verbal mediation condition were utilized. 
The principal difference was in the lists. Four 
lists of concrete and abstract nouns were 
constructed by using the 240 (120 concrete 
and 120 abstract) words listed by Gorman 
(1961). The four lists of 30 pairs each repre- 
sented the four possible combinations of peg 
and response word abstractness (i.e., C-C,
C-A, A-C, A-A). The 120 abstract and 120 
concrete words were randomly assigned to 
their four possible positions (2 peg-word 
lists and 2 response-word lists). The order 
of the words within any of the resulting 
paired-associate lists was random with the 
only restriction that any peg and response 
word pair did not start with the same letter. 
A total of 48 Ss (24 per condition) was pre-
sented each of the four lists using a proce-
dure essentially identical to that used in Ex-
periments I-IV. Two Ss (one from each
instruction condition) were run at the same
time. The order of presentation of the four 
lists was completely counterbalanced. 
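
With four lists there are 4! = 24 possible presentation orders, which matches the 24 Ss per condition. The following is a minimal sketch of complete counterbalancing under the assumption that each order was given to exactly one S in each condition; the text does not spell out the assignment, and the S labels are hypothetical.

```python
from itertools import permutations

LISTS = ("C-C", "C-A", "A-C", "A-A")


def counterbalanced_orders(lists=LISTS):
    """Return every possible presentation order of the lists."""
    return list(permutations(lists))


orders = counterbalanced_orders()  # 4! = 24 distinct orders

# Hypothetical assignment: the i-th S in each condition gets the i-th order.
assignment = {
    condition: {f"S{i + 1}": order for i, order in enumerate(orders)}
    for condition in ("bizarre imagery", "verbal mediation")
}
```

Under this scheme every list occupies every ordinal position equally often (six times per condition), so list effects and practice effects are not confounded.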


Results 


The mean number of words cor- 
rectly recalled, disregarding order of 
recall, for the bizarre imagery condi- 
tion, was 19.92, 18.25, 16.46, and 8.62 
for the C-C, C-A, A-C, and A-A lists, 
respectively. Mean recall for the ver- 
bal mediation condition was 20.58, 
17.83, 17.04, and 10.42 for the C-C, 
C-A, A-C, and A-A lists respectively. 
A Latin-square design analysis of 
variance of these scores is presented 
in Table 4. As expected, word ab-
stractness (C-C versus A-A) had a
strong influence on amount recalled. 
The failure to find a significant su- 
periority of the C-A over the A-C 
list (locus of abstractness) is not en- 
tirely consistent with Paivio’s (1965) 
findings, but this failure to replicate 
could be due to procedural differences. 
In the present experiment the de- 
pendent measure was the number 
correctly recalled on the first (and 
only) recall trial; in the Paivio study 
the dependent measure was the num- 
ber correctly recalled on the first four 
recall trials. It is possible that one 
trial does not provide sufficient time 
for a C-A over A-C superiority to be 
manifested. Since different words 
were used for the four lists it might 
be that the words comprising the 
C-A and A-C lists were of unequal 
difficulty. That is, even though the 
words were assigned to the four lists 
on a random basis, the concrete or 
abstract words of one list may have 
been easier than those of another 
list. Yet the fact that the results of 
the present experiment are so similar 
to Paivio’s results suggests that this 
was not the case. As in the Paivio 
study, the ease of learning the four 

TABLE 4
ANALYSIS OF EXPERIMENT V

Source of variance              df        MS           F
Groups (G)                       1       13.54        1.53
[label illegible]               23       61.58        6.99***
Ordinal position (OP)            3       91.83       10.42***
Lists (L)                       (3)   (1020.64)    (115.85)***
Abstractness (A)                 1     2762.76      318.29***
[row illegible]
[label illegible]               23      110.59       12.55**
G X OP                           3        7.23         —
G X L                           (3)     (14.82)      (1.68)
G X A                            1        7.69         —
G X Locus of A                   1       13.50        1.53
G X A X Locus of A               1       23.97        2.05
Pooled residual                132        8.81
Total                          191

*** p < .001.


lists in the present experiment was in 
the order C-C, C-A, A-C, A-A. Also, 
the superiority of the C-A over the 
A-C list approached significance (.10 
> p > .05). In any case, the pri- 
mary intent of the present study was 
not to replicate Paivio’s findings, 
but to test the notion that instruc- 
tion conditions could produce differ- 
ential performance on the lists. 
Instruction condition (Groups) and 
the interactions of instruction condi- 
tion with the list variables (i.e., 
Groups X Abstractness, Groups X
Locus of Abstractness, and Groups X
Abstractness X Locus of Abstractness)
failed to reach significance. Thus, 
these results are consistent with the 
earlier experiments of this series in 
failing to demonstrate a differential 
effect of instructions to use verbal 
mediation versus instructions to use 
bizarre imagery. However, there is 
slight support for Paivio’s notion that 
the tendency of a word to arouse 
mediating imagery should be greater 
when the word is in the stimulus posi- 
tion. A comparison of the mean recall 
for the bizarre imagery and verbal 
mediation conditions reveals that the 
bizarre imagery condition was su- 
perior (though not significantly so) 
to the verbal mediation condition 
only for the C-A list. Also, the dif-
ference in performance between the
C-A and the A-C lists was greater 
for Ss in Group 1 (bizarre imagery) 
than Group 4 (verbal mediation). 
Thus, instructions to use imagery 
slightly increased a difference be- 
lieved to be caused by the use of 
imagery. 

The significant interaction between 
Abstractness X Locus of Abstract- 
ness (under the heading Lists in 
Table 4) is a rather interesting find- 
ing in that it suggests that increasing 
the concreteness of the stimulus or 
the response of an A-A list (i.e., 
comparing an A-A list with an A-C 
or C-A list) produces a greater 
change in performance (facilitation) 
than the change obtained by decreas- 
ing the concreteness of a C-C list 
(i.e., comparing a C-C list with an 
A-C or C-A list). 
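
The direction of this interaction, and of the C-A versus A-C comparison discussed above, can be read off the cell means given in the Results. The sketch below is descriptive arithmetic on those reported means only; averaging the two instruction conditions with equal weight is an assumption that rests on the equal group sizes (24 Ss each), and the names are illustrative.

```python
# Mean recall in Experiment V, as reported in the Results text.
means = {
    "bizarre imagery": {"C-C": 19.92, "C-A": 18.25, "A-C": 16.46, "A-A": 8.62},
    "verbal mediation": {"C-C": 20.58, "C-A": 17.83, "A-C": 17.04, "A-A": 10.42},
}


def average_over_conditions(list_type):
    """Unweighted mean of the two instruction conditions for one list type."""
    return sum(cell[list_type] for cell in means.values()) / len(means)


# Mean of the two mixed lists (C-A and A-C).
mixed = (average_over_conditions("C-A") + average_over_conditions("A-C")) / 2

# Change produced by making one member of an A-A pair concrete,
# versus making one member of a C-C pair abstract.
gain_from_a_a = mixed - average_over_conditions("A-A")   # about 7.9 words
loss_from_c_c = average_over_conditions("C-C") - mixed   # about 2.9 words

# C-A minus A-C difference within each instruction condition.
locus_difference = {
    condition: cell["C-A"] - cell["A-C"] for condition, cell in means.items()
}  # about 1.79 (bizarre imagery) and 0.79 (verbal mediation)
```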

The finding of a significant ordinal 
position effect is consistent with the 
earlier experiments in this series. Once 
again, there was a significant increase 
in performance as a function of prac- 
tice. An analysis was made of the 
number of words correctly paired, 
the number of intrusions, and the in- 
trusion rates to determine whether 
any of these measures differentiated 
the two instruction conditions. None 
did. 


Discussion 


Brief Summary of Results 


The main findings of the five ex- 
periments were as follows: Groups 
given peg words and the group in- 
structed to link successive items of 
the list with a bizarre image per- 
formed better than a control group 
given standard free-recall instruc- 
tions. The groups which differed only 
with respect to rehearsal instructions 
had essentially equivalent recall; the 
groups which differed only with re-
spect to bizarreness (i.e., common ver-
sus bizarre images) instructions had
essentially equivalent recall. Recall
scores on relatively high frequency
concrete noun lists (Experiment I), a
negative transfer paradigm (Experi- 
ment III), a high interference list (Ex- 
periment IV), a “mediation” list 
(Experiment IV), and lists of varying 
degrees of word abstractness (Experi- 
ment V) were not significantly in- 
fluenced as a function of whether the 
instructions were to link peg and re- 
sponse words by verbal mediation or 
by bizarre images. A significant inter- 
action was obtained between Instruc- 
tion Condition X Presentation Rate 
(Experiment II) in that the difference 
in performance between a group given 
peg words and instructed to use bizarre 
imagery and a group given standard 
free-recall instructions was greater at
a 5-second presentation rate than at a 2-
second presentation rate. There was
only very limited support for Paivio’s 
notion that imagery is the factor ac- 
counting for the superiority of 
paired-associate lists consisting of con-
crete-abstract word pairs over ab-
stract-concrete word pairs since this
superiority was only slightly increased 
by instructing Ss to use imagery to 
link the words. Finally, Ss given 
peg words generally showed an im- 
provement with practice while those 
not given peg words did not. Addi-
tional comments are necessary for
some of these findings. 


Peg Words 


First, a procedural question needs 
to be briefly considered. Rather than 
have Ss memorize the peg list(s) 
prior to the presentation of the re- 
sponse lists, Ss were presented with 
the peg words during both the learn- 
ing and recall trial. Since this is not 
the way a mnemonic system is typi-
cally used by “mnemonic experts” or
by other investigators of mnemonics
(e.g., Smith & Noble, 1965), generali- 
zations from this study may be 
limited. Yet, despite the interpreta- 
tional limitations, there are several 
advantages to the procedure utilized. 
If Ss in the experimental conditions 
had been given special training or 
practice on a system, this experience 
may have produced differential moti- 
vational effects and/or nonspecific 
transfer effects on the test lists. But, 
most important, by making the peg 
list(s) available, the possibility of 
facilitating recall with a mnemonic 
system, if, in fact, the effective 
utilization of a mnemonic system
does result in superior recall, is maxi- 
mized. The argument is that a test of 
a particular strategy or system should 
begin by utilizing those conditions 
which are allegedly ideal for the 
functioning of the system. 

Why does the presence of peg words, 
under the present conditions, facili- 
tate recall? It is apparent, since peg 
words and response words were con- 
sistently paired correctly, that the 
peg words served as stimuli for the 
words to be recalled. Since the pro- 
viding of a stimulus for each response 
word, in effect, makes the free-recall 
task a paired-associate task, the prob- 
lem is explaining why, with the pres- 
ent materials, a paired-associate task 
should be easier than a free-recall 
task. Was the presence of the stimulus 
words solely responsible for the facil- 
itated recall, or were the strategies 
utilized by the Ss also an important 
factor? It is possible that the strategy 
S used to link peg and response 
words was primarily responsible for 
the facilitated recall. That is, Ss 
given peg words and standard paired- 
associate instructions may only have 
performed at a level equal to the 
groups not given peg words. Since a 
group given standard paired-associate 
instructions was not utilized, there is
no evidence as to whether the use of 
imagery or mediation would facilitate 
performance relative to a control 
given standard paired-associate in- 
structions. 

Other investigators have attempted 
to determine how mediation instruc-
tions affect the learning of paired-
associate lists. Although Garskof, San- 
dak, and Malinowski (1965) were 
able to facilitate the performance of 
Ss given two successive lists (A-B, 
A-D paradigm) by instructing Ss to 
mediate, it was not clear from their 
study, since they only reported the 
mean performance on the two lists 
combined, whether the instructions 
facilitated performance on both the 
A-B and A-D list or merely on the 
A-D list. The present concern is not 
with how instructions to mediate in- 
fluence transfer; the present concern 
is whether instructions to mediate 
facilitate the acquisition of a single 
list. Martin and Dean (1966) were 
unable to facilitate first-list learning 
by giving mediation instructions; Mc-
Nulty (1966) was unable to facili-
tate the performance of motivated 
Ss on a single list by instructing 
them to mediate. In light of the 
absence of a clear demonstration that 
instructions to mediate facilitate the 
acquisition of a single list for normal, 
motivated Ss, and the failure of the 
different instruction conditions of the 
present experiments to influence per- 
formance, it is likely that instructing 
Ss to use a particular strategy has, at 
best, only a slight effect on the acqui- 
sition of a single paired-associate list. 

Before generalizing that providing 
Ss with a set of stimuli (peg list) 
facilitates recall it would seem neces- 
sary to have Ss learn other kinds of 
lists. For example, would a list which 
required a considerable amount of 
response learning (such as a low-
meaningfulness nonsense-syllable list)
be facilitated by the presentation of 
a peg-word list? The point is that in 
the present study response learning 
was minimal because the words were 
of high meaningfulness; therefore, the
associative stage was the principal 
constituent of the learning. However, 
it might be expected that providing 
Ss with a peg list would not facilitate 
recall if the material to be learned 
required a considerable amount of 
response learning. An interaction be- 
tween meaningfulness and the pres- 
ence or absence of a peg list should 
occur such that the superiority of 
the peg-list condition relative to the 
control would be greater for lists of 
high meaningfulness than for low 
meaningfulness. In fact, evidence
from the present study indicates that 
this interaction might be obtained 
even when words are used for the peg 
and response lists. Although the ap- 
propriate control group was not run, 
the low mean recall on the abstract- 
abstract list (8.62 and 10.42 for Con- 
ditions 1 and 4 respectively) of Ex- 
periment V strongly suggests, to the 
extent that abstractness and meaning- 
fulness are correlated, that the su- 
periority of a peg-list condition rela- 
tive to a control not given a peg list 
decreases as the meaningfulness of the 
peg and response words decreases. 

If, as in the present study, provid- 
ing Ss with a stimulus facilitated re- 
call of high meaningful word lists, it 
should follow that other attempts to 
“provide” or “define” the stimulus 
might also result in facilitated per- 
formance. Thus, instructing Ss to use 
contextual objects or other items in 
the list as stimuli should result in 
facilitated recall. There is also rea- 
son to suspect that the nature of the 
stimuli might determine whether or 
not supplying the stimuli for the re- 
call of the response words facilitates 
performance; for example, compare the
performance on the concrete-abstract 
and abstract-abstract lists of Experi- 
ment V. Also, it has yet to be estab- 
lished whether numbers or the alpha- 
bet can serve as effective stimuli for 
recall when the materials to be re- 
called are not specifically designed to 
conform to a numerical or alphabeti- 
cal sequence. If numbers or the alpha- 
bet can serve as effective stimuli, a 
peg list is, of course, readily availa- 
ble for the facilitation of recall. 

Besides the present study, there is 
apparently only one other reported 
instance of facilitated “free-recall” 
learning resulting from instructing 
Ss on a particular strategy. Tulving 
(1962) demonstrated that instructing 
Ss to use an alphabetic organization 
of the items produced significantly 
superior recall relative to a control 
group given standard free-recall in- 
structions. The Ss in the instructed 
condition were told to make associa- 
tions between each letter of the al- 
phabet and the response word having 
that initial letter. Each response word 
started with a different letter. Thus, 
Ss in effect were given a peg list 
(since the alphabet is highly availa- 
ble) and told to make associations 
between each peg and response word. 
The present study extends the Tulving 
finding in that the peg lists and re- 
sponse lists for the present study were 
randomly selected from a pool of 
words and not limited to the alphabet 
or to sets of words all having different 
initial letters. 


Linking Successive Items 


There are at least two ways to in- 
terpret the finding that instructing 
Ss to link successive items of the list 
with a bizarre image resulted in su- 
perior recall relative to a control. 
Deese (1960) maintains that the 
inter-item associative strength of the 


words in the list determines amount 
of recall because words that are as- 
sociated elicit each other at recall. 
Underwood and Schulz (1960) sug- 
gest that response recall is a matter 
of increasing the strength of the re- 
sponse words to the contextual situa- 
tion so that the response strengths of 
items in the list are greater than 
other responses that are not in the 
list. The major difference between the 
two positions appears to be whether 
the emphasis is on the stimulus elicit- 
ing the response or increasing the re- 
sponse strength so that the response 
can be elicited by contextual stimuli. 
That is, Underwood and Schulz em- 
phasize the frequency of elicitation of 
each item in the list. Deese empha- 
sizes the associational connections 
among items in the list. Since the in- 
structions to associate successive 
items resulted in superior recall it 
might appear that Deese’s position 
was unequivocally supported. How- 
ever, it can also be maintained that 
instructions to associate items in- 
crease the frequency of elicitation of 
the items in that Ss are “forced” to 
rehearse earlier items in order to 
make associations among the items. 
Fortunately, it appears a differential 
prediction can be made from the 
above two positions regarding the 
order of recall of the items. It fol- 
lows from Deese’s point of view that 
items which have been associated 
during learning should occur together 
in recall if, in fact, words that are 
associated elicit each other during re-
call. However, if frequency is the
crucial variable then there is no rea- 
son to suspect that items occurring to- 
gether during learning should neces- 
sarily occur together at recall since 
no prediction regarding order of recall 
follows from the frequency principle. 
Thus, if an increase in inter-item as- 
sociative strength was responsible for 
the facilitated recall of the group in-
structed to link successive items dur-
ing learning, the order of recall for 
this group should, relative to a con- 
trol, closely approximate the order of 
presentation of the list. 

To test the above prediction the 
order of recall of the group told to 
link successive items and the group 
given standard free-recall instructions 
(Experiment I) was examined. The 
number of pairs of words occurring 
successively in recall that were pre- 
sented successively during the learn- 
ing trial was counted for these groups. 
The order of the words within the 
pair was ignored, that is, if the first 
and second words presented were re- 
called in reverse order (second word 
first and first word second) this was 
still counted as a cluster. An S having 
perfect clustering would have one less 
cluster than the number of words cor- 
rectly recalled. For each S in the two 
conditions the total number of clusters 
was obtained for each list and di- 
vided by the total number of words 
correctly recalled to yield a cluster 
rate. A mean cluster rate was ob- 
tained for each S over all three lists. 
The mean for the Ss in the group in- 
structed to link successive items was 
.506; the mean for the Ss in the con- 
trol was .377. This difference was sig- 
nificant (F = 5.09, df = 1/34, p < 
.05). Thus, this analysis supported 
Deese’s notion that inter-item asso- 
ciative strength determines amount 
recalled in that the order of recall 
was shown to reflect the associative 
connections that Ss were instructed to 
make during the learning trial. It is 
of course possible that the Underwood 
and Schulz frequency notion could be 
extended to include a prediction re- 
garding order of recall. It seems un- 
likely, however, that this could be 
done without ascribing an eliciting 
function to the words in the list. 
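
The cluster-rate measure described above can be sketched as follows. This is an illustration under stated assumptions rather than the scoring routine actually used: intrusions are taken to break adjacency in the recall protocol, each list word is assumed to be written at most once, and the function names are hypothetical.

```python
def count_clusters(presented, recalled):
    """Count adjacent recall pairs that were also adjacent at presentation.

    The order of the two words within a pair is ignored.
    """
    position = {word: index for index, word in enumerate(presented)}
    clusters = 0
    for first, second in zip(recalled, recalled[1:]):
        # Intrusions (words not on the list) cannot form a cluster.
        if first in position and second in position:
            if abs(position[first] - position[second]) == 1:
                clusters += 1
    return clusters


def cluster_rate(presented, recalled):
    """Clusters divided by the number of words correctly recalled."""
    listed = set(presented)
    n_correct = len({word for word in recalled if word in listed})
    if n_correct == 0:
        return 0.0
    return count_clusters(presented, recalled) / n_correct


def mean_cluster_rate(trials):
    """Mean cluster rate for one S over (presented, recalled) list pairs."""
    rates = [cluster_rate(presented, recalled) for presented, recalled in trials]
    return sum(rates) / len(rates)
```

Because perfect clustering yields one fewer cluster than the number of words recalled, the rate for an S who recalls n words in presentation order is (n - 1)/n rather than 1.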


Rehearsal 


Although the instructions to re- 
hearse produced a slight, though in- 
significant, facilitatory effect, the in- 
terpretation of these results must be 
extremely guarded because there is 
some doubt as to whether or not the 
instruction conditions actually pro- 
duced a difference in rehearsal. If the 
group told to rehearse previously 
formed images whenever possible did 
not have time to do so, then the re- 
hearsal variable was not manipulated 
in the present study. Unfortunately, 
even though the Ss, when queried 
about whether they were able to fol- 
low their special instructions, re- 
ported that they were able to follow 
the instructions, no measure was ob- 
tained of the extent to which they re- 
hearsed previous items. In short, no 
definitive statement regarding the in- 
fluence of rehearsal can be made. 


Presentation Rate 


Investigators have reasoned that if 
Ss use mediation, optimal perform- 
ance should be obtained with an antic- 
ipation interval which is sufficiently 
long to give the Ss an opportunity to 
utilize mediators. Some investigators 
(e.g., Runquist & Marshall, 1963;
Schulz & Lovelace, 1964) have ob- 
tained results which support this no- 
tion and others (e.g., Schwenn & Un- 
derwood, 1965; Spear, Mikulka, & 
Podd, 1965) have not. In light of the 
conflicting findings and the lack of an 
apparent reason for the conflicting 
findings, no general conclusion is 
warranted regarding the presence or 
absence of an optimal presentation 
rate when Ss use mediation. It should 
be noted that the above studies uti- 
lized transfer paradigms (A-B, A-B’; 
simulated A-B, A-B’; or A-B, C-D, 
A-D where a preestablished associa- 
tion between B and C was inferred 
from association norms). Another ap-
proach to the study of presentation
rate and mediation would be to have 
Ss learn a single list with different 
presentation rates under mediation or 
nonmediation instructions. It seems 
plausible that the failure to demon-
strate a reliable facilitatory effect on a
single list as a function of mediation 
instructions may be due to the failure 
to use slow presentation rates (5 sec- 
onds or more). In any case, for the 
conditions of the present study (Ex- 
periment II), the results clearly indi- 
cate that for Ss using peg lists a pres- 
entation rate of 5 seconds per item is 
more “nearly optimal” than a pres- 
entation rate of 2 seconds per item. 


Imagery as an Association Aid 


The present series of studies affords
little support for the view that im- 
agery serves as an effective associa- 
tional aid in that instructing Ss to use 
imagery did not facilitate perform- 
ance relative to the group (verbal me- 
diation) not instructed to use im- 
agery. In Experiment V, even though 
the difference between the concrete- 
abstract and abstract-concrete list 
was greater for Ss in the imagery in- 
struction condition, the failure to ob- 
tain a significant interaction between 
the locus of abstractness (stimulus or 
response) and instruction condition 
suggests that either the utilization of 
imagery does not facilitate perform- 
ance under these conditions, or that 
the utilization of imagery cannot be 
effectively manipulated by instruc- 
tions. Thus, Experiment V does not 
provide convincing support for the 
notion that the superiority of con- 
crete-abstract lists to abstract-con- 
crete lists is due to imagery. 

Imagery has also been offered as 
an explanation for the superiority of 
noun-adjective lists to adjective-noun 
lists, as nouns are believed to elicit
more images than adjectives. The
rationale for the superiority of noun- 
adjective lists over adjective-noun 
lists is the same as for the su-
periority of concrete-abstract lists
over abstract-concrete lists (Lambert 
& Paivio, 1956). A recent study by 
Perrino (1965), however, casts con-
siderable doubt on the notion that
imagery is involved in producing this
phenomenon. Perrino, among other
things, embedded disyllable nonsense
words in grammatical sentence frames
by using English function words and
verbs. The nonsense words were em-
bedded in adjective and noun posi-
tions. After Ss were given trials in
which they learned to recognize the 
grammatical strings, they were given 
a paired-associate list in which the 
nonsense pairs, now presumed to have 
acquired form class (noun or adjec- 
tive) membership, served as the stim- 
uli and responses. The pairs having 
noun-adjective form class were 
learned more rapidly than the adjec- 
tive-noun pairs for the first six trials. 
Although the absolute differences 
were not large, grammatical structure 
appears to have been responsible for 
producing the effect. Since there seems 
to be little justification for maintain- 
ing that the nonsense disyllables 
differed on imagery, there is little rea- 
son to suspect that imagery func- 
tioned as an associational aid to 
produce these differences. Thus, if 
imagery functions as an associative 
aid, it is not the sole factor producing 
the noun-adjective, adjective-noun ef-
fect and presumably the concrete-
abstract, abstract-concrete effect.


Nonspecific Transfer 

The finding of positive nonspecific 
transfer for Ss receiving peg words 
and the failure to find positive non- 
specific transfer (warm-up and/or 
learning to learn) for Ss not re-
ceiving peg words is consistent with
the findings of other investigators. 
Since Ss receiving peg words in 
actuality are being given a paired- 
associate task while those Ss not 
given peg words have a free-recall 
task, the issue is why nonspecific pos- 
itive transfer is obtained using a 
paired-associate procedure and not 
with a free-recall procedure. With 
respect to paired-associate learning, 
the finding of positive transfer is 
consistent with other studies (e.g., 
Postman & Schwartz, 1964; Thune, 
1951). Yet for free recall the situa- 
tion is more complex. Dallett (1963) 
points out that learning to learn is 
difficult to obtain when only a few 
learning trials per list are given. 
Since only one trial was given per list 
in the present study, the failure to ob- 
tain learning to learn or warm-up ef- 
fects for those groups not given peg 
words is consonant with the findings 
of most other investigators (e.g.,
Deese, 1957; Murdock, 1960). Yet, 
Murdock (1962) gave each S 20 lists 
per session, one trial per list, and four 
different sessions, and found improve- 
ment across sessions. However, the ef- 
fect was small for all six of his con- 
ditions and significant for only four 
of the six conditions. If only the first 
recall trial is considered, Tulving,
McNulty, and Ozier (1965) and Dal-
lett (1963) were also unable to dem-
onstrate a learning to learn effect.
Learning to learn has been obtained 
in free recall when Ss have been 
given repeated trials on the same list 
(Dallett, 1963; Meyer & Miles, 1953; 
Tulving et al., 1965). 


Implications of This Research 


The present study strongly suggests 
that the utilization of a well-mem- 
orized peg list can produce a marked 
facilitation, relative to a control, in 
the number of words that Ss are able 
to remember. Yet this does not neces-
sarily mean that memory systems
should be advocated. The decision to 
adopt a mnemonic system approach 
to the problem of how best to mem- 
orize verbal material should depend, 
hopefully, on more considerations 
than the demonstration that a par-
ticular approach can, under some cir-
cumstances, result in facilitated re-
call. Although the present study does
not provide an unequivocal answer to 
the question of whether to adopt a 
mnemonic system approach, it gives 
at least partial answers for many of 
the considerations. 

There are compelling reasons for 
seriously considering a mnemonic sys- 
tem approach as a means for facili- 
tating memory. The present study 
suggests that: (a) The amount of re- 
call of concrete and perhaps abstract 
words is increased when a peg list is 
used; (b) the correct serial ordering 
of recall seems to be an inherent re- 
sult of a peg list approach; (c) ma- 
terial in the middle of a list does not 
appear to be any more difficult to 
learn than material at the ends of the 
list. The question of whether the re- 
call of abstract words is facilitated 
when a peg list is utilized requires 
further elaboration. Although a group 
given standard free-recall instructions 
(no peg list) was not employed in 
Experiment V, an estimate of the 
number of words such a group would 
have recalled can be made by utiliz- 
ing Murdock’s (1960, p. 231) form- 
ula. A recall of approximately 15 
words would be expected from a list 
of 30 words presented at a rate of 
one word every 5 seconds. Yet the 
two groups given peg lists consisting 
of concrete words had mean recalls 
of 18.25 and 17.33 for the list of 30 
abstract words. This evidence, albeit 
tenuous, suggests that the presence of 
a peg list can also facilitate the recall 
of abstract words. 
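
The comparison in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text: a 30-word list, an expected free recall of approximately 15 words, and observed means of 18.25 and 17.33. It does not reproduce Murdock's formula itself; the function and variable names are illustrative assumptions.

```python
# Figures quoted in the text; nothing here is new data.
LIST_LENGTH = 30               # abstract words in the Experiment V list
EXPECTED_FREE_RECALL = 15.0    # "approximately 15 words", estimate taken from the text
PEG_GROUP_MEANS = (18.25, 17.33)


def advantage(observed, baseline=EXPECTED_FREE_RECALL):
    """Words recalled beyond the estimated free-recall baseline."""
    return observed - baseline


def proportion(words, list_length=LIST_LENGTH):
    """Share of the list that was recalled."""
    return words / list_length


for mean in PEG_GROUP_MEANS:
    print(
        f"mean recall {mean:.2f}: "
        f"{advantage(mean):+.2f} words relative to the estimate, "
        f"{proportion(mean):.1%} of the list "
        f"(estimate: {proportion(EXPECTED_FREE_RECALL):.1%})"
    )
```

Because the baseline is an estimate rather than an observed control group, the differences are descriptive only and carry no significance test.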

Several facts argue against the use
of a peg-list approach to memorizing
verbal material. Obviously, a certain 
amount of time and effort is neces- 
sary in order to learn the peg list. 
Thus, unless the peg list is used to 
learn many lists, it would seem doubt- 
ful whether the approach would re- 
sult in an overall reduction in the 
amount of time required to memorize 
the material. Yet performance is 
likely to decrease as a function of the 
number of times the peg list was pre-
viously used. That is, although there
is improvement with practice if a dif-
ferent peg list is used for each re-
sponse list, the time required to mas-
ter a peg list generally precludes
memorizing more than one peg list. Given
that only one peg list is used, the re- 
sults of Experiment III suggest that 
performance will decrease with each 
succeeding list. Thus, in the case in 
which lists of unrelated words are to 
be recalled, the effectiveness of a peg- 
list approach would seem to depend 
on whether Ss using a peg list con- 
tinue to be superior to control Ss when 
numerous (say several hundred) lists 
are learned. 

A final note of caution seems to be 
in order. Generalizations from the 
present study to practical learning 
situations should be guarded for at 
least two additional reasons. First, 
there is little evidence concerning the 
effectiveness of a mnemonic system 
approach when “organized” or com- 
plex material is to be memorized. 
Second, the present study does not 
afford any information regarding 
the long-term retention of material 
learned by means of a peg-list ap-
proach. Yet, obviously, when many
lists are learned by means of the same 
peg list the retention of these lists will 
be markedly hindered by retroactive 
and/or proactive interference. The 
extent of this interference relative to 
a group not utilizing peg lists remains 
undetermined. 


Summary 


A series of five experiments was 
conducted to determine under what 
circumstances, if any, mnemonic sys- 
tems facilitated recall, the elements of 
the system responsible for the “facil- 
itated” recall, and the relationships 
between mnemonic systems and some 
other variables known to affect learn- 
ing. The basic component of a mne- 
monic system is generally a list of peg 
words. During learning one makes 
an association between each word to 
be memorized and a peg word. Then 
for recall, the peg word, which is 
readily available, “elicits” the word 
to be recalled. In Experiment I the 
recall of a list of 40 words was com- 
pared for four groups provided with 
a peg list and three groups not pro- 
vided with a peg list. The four groups 
receiving peg lists differed with re- 
spect to the kind of association they 
were asked to make to connect each 
peg and response word. Two groups 
were told to use bizarre imagery. 
One group was told to use common 
imagery. The fourth group was told to 
use a form of verbal mediation. One 
of the groups told to use bizarre im- 
agery was instructed to rehearse pre- 
vious images; the other group was 
told not to rehearse. Of the three 
groups not given peg words, one 
group was told to make an image of 
each response word. The second group 
was instructed to link successive re- 
sponse words with a bizarre image. 
The third group, a control, received 
standard free-recall instructions. In 
Experiment II a 2-second and a 5- 
second presentation rate were used to 
determine if there was an optimal 
presentation rate for Ss utilizing 
mnemonics. In Experiments III-V a 
bizarre imagery group and a verbal 
mediation group were utilized to as-
sess whether these groups would per-
form differentially on a negative
transfer paradigm (Experiment III),
a high interference or a “mediation” 
list (Experiment IV), and word lists
of varying degrees of abstractness 
(Experiment V). 
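
The peg-word mechanism described in this summary (each word to be memorized is associated with a peg word, and the peg word later elicits it) can be sketched as a simple lookup. This is only an illustration under stated assumptions: the function names are hypothetical, the four pairs are the first four List I pairs of Table A1, and a forgotten association is modeled crudely as a missing entry.

```python
def learn(peg_list, response_words):
    """Associate each response word with the peg word in the same serial position."""
    return dict(zip(peg_list, response_words))


def recall(peg_list, associations):
    """Cue recall with each peg word in order; None marks an item that was not retrieved."""
    return [associations.get(peg) for peg in peg_list]


pegs = ["wheel", "flea", "armor", "pearl"]
responses = ["kitten", "actor", "closet", "gas"]

associations = learn(pegs, responses)

# Hypothetical forgetting: drop one association to show that the other
# items are still recalled in their correct serial positions.
partial = {peg: word for peg, word in associations.items() if peg != "armor"}

print(recall(pegs, associations))  # ['kitten', 'actor', 'closet', 'gas']
print(recall(pegs, partial))       # ['kitten', 'actor', None, 'gas']
```

Because recall is cued by the peg list in its fixed order, serial position comes along with each recalled item, which is consistent with the observation that correct ordering is an inherent result of the peg-list approach.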

The main findings of the five ex- 
periments were: Groups given peg 
words and the group instructed to 
link successive items of the list with 
a bizarre image performed better than 
the control given standard free-recall 
instructions. Rehearsal and “bizarre- 
ness” instructions had no apparent 
effect on performance. Recall scores
on relatively high frequency concrete 
noun lists (Experiment I), a nega- 
tive transfer paradigm (Experiment 
III), a high interference list (Experi-
ment IV), a “mediation” list (Experi-
ment IV), and lists of varying degrees
of word abstractness (Experiment V) 
were not significantly influenced as a 
function of whether the instructions 
were to link peg and response words 
by verbal mediation or by bizarre 
images. A significant interaction was 
obtained between instructions and 
presentation rate (Experiment II) in 
that the difference in performance 
between a group given peg words and 
instructed to use bizarre imagery and 
a group given standard free-recall in- 
structions was greater at a 5-second 
presentation rate than a 2-second 
presentation rate. The Ss given peg 
words generally showed an improve- 
ment with practice while those not 
given peg words did not. 

The nature of the stimulus in re- 
call, the relationship between the 
presence of peg words and recall for 
materials of varying degrees of mean- 
ingfulness, a frequency versus an 
inter-item associative strength in- 
terpretation of free-recall learning, 
imagery versus verbal mediation, re- 
hearsal, presentation rate, imagery as 
an associational aid, and nonspecific
transfer were the problems briefly
discussed. 


REFERENCES 


Balaban, A. Über den Unterschied des logis-
chen und des mechanischen Gedächtnisses.
Zeitschrift für Psychologie, 1910, 56, 356- 
377. 

Battig, W. F., Brown, S. C., & Nelson, D.
Constant vs. varied serial order in paired- 
associate learning. Psychological Reports, 
1963, 12, 695-721. 

Bousfield, W. A., Whitmarsh, G. A., &
Esterson, J. Serial position effects and the
“Marbe effect” in the free recall of mean- 
ingful words. Journal of General Psychol- 
ogy, 1958, 59, 255-262. 

Bugelski, B. R. Presentation time, total
time, and mediation in paired-associate 
learning. Journal of Experimental Psy- 
chology, 1962, 63, 409-412. 

Dallett, K. M. Practice effects in free and
ordered recall. Journal of Experimental 
Psychology, 1963, 66, 65-71. 

Deese, J. Serial organization in the recall of 
disconnected items. Psychological Re- 
ports, 1957, 3, 577-582. 

Deese, J. Frequency of usage and the num-
ber of words in free recall: The role of as-
sociations. Psychological Reports, 1960, 7,
337-344.

Deese, J., & Kaufman, R. A. Serial effects in
recall of unorganized and sequentially or- 
ganized material. Journal of Experimental 
Psychology, 1957, 54, 180-187. 

Furst, B. The practical way to a better 
memory. New York: Fawcett World Li- 
brary, 1957. 

Garskof, B. E., Sandak, J. M., & Malinow-
ski, E. W. A. Controlling the “fate” of
first list associates. Psychonomic Science, 
1965, 2, 315-316. 

Gorman, A. M. Recognition memory for 
nouns as a function of abstractness and 
frequency. Journal of Experimental Psy-
chology, 1961, 61, 23-29.

Grant, D. A. The Latin square principle in 
the design and analysis of psychological 
experiments. Psychological Bulletin, 1948, 
45, 427-442. 

Lambert, W. E., & Paivio, A. The influence
of noun-adjective order on learning. Ca-
nadian Journal of Psychology, 1956, 10,
9-12.


Martin, R. B., & Dean, S. J. Reported me-
diation in paired-associate learning. Jour-
nal of Verbal Learning and Verbal Be-
havior, 1966, 5, 23-27.




McNulty, J. A. The effects of “instructions
to mediate” upon paired-associate learn- 
ing. Psychonomic Science, 1966, 4, 61-62. 

Meyer, D. R., & Miles, R. C. Intralist-in-
terlist relations in verbal learning. Journal
of Experimental Psychology, 1953, 45, 
109-115. 

Murdock, B. B. The immediate retention of
unrelated words. Journal of Experimental 
Psychology, 1960, 60, 222-234. 

Murdock, B. B. The serial position effect of
free recall. Journal of Experimental Psy-
chology, 1962, 64, 482-488.

Nutt, R. H. How to develop a good memory
for names, faces, and facts. New York: 
Simon & Schuster, 1941. 

Paivio, A. Abstractness, imagery, and mean-
ingfulness in paired-associate learning.
Journal of Verbal Learning and Verbal Be- 
havior, 1965, 4, 32-38. 

Perrino, C. An evaluation of the effect of
grammatical structure on learning. Un- 
published master’s thesis, Northwestern 
University, 1965. 

Postman, L., & Schwartz, M. Studies of
learning to learn: I. Transfer as a function
of method of practice and class of verbal
materials. Journal of Verbal Learning and
Verbal Behavior, 1964, 3, 37-49.

Roth, D. M. Roth memory course. Santa
Monica: Motivation, 1961. 

Runquist, W. N., & Farley, F. H. The use
of mediators in the learning of verbal 
paired associates. Journal of Verbal Learn- 
ing and Verbal Behavior, 1964, 3, 280-285. 

Runquist, W. N., & Marshall, M. A.
Transfer, synonymity, and anticipatory in- 
terval in paired-associate verbal learning. 
American Journal of Psychology, 1963, 76, 
281-286. 

Russell, W. A., & Jenkins, J. J. The com-
plete Minnesota norms for responses to
100 words from the Kent-Rosanoff Word
Association Test. Technical Report No.
11, 1954, Contract N8 ONR-66216, Office
of Naval Research.

Schulz, R. W., & Lovelace, E. A. Mediation
in verbal paired-associate learning: The
role of temporal factors. Psychonomic
Science, 1964, 1, 95-96.

Schwenn, E., & Underwood, B. J. Simulated
similarity and mediation time in transfer. 
Journal of Verbal Learning and Verbal 
Behavior, 1965, 4, 476-483. 

Smith, R., & Noble, C. E. Effects of a
mnemonic technique applied to verbal 
learning and memory. Perceptual and 
Motor Skills, 1965, 21, 123-134. 

Spear, N. E., Mikulka, P. J., & Podd, M.
Transfer as a function of time to mediate. 
Journal of Experimental Psychology, 1966, 
72, 40-46. 

Thune, L. E. Warm-up effect as a function
of level of practice in verbal learning. 
Journal of Experimental Psychology, 1951, 
42, 250-256. 

Tulving, E. The effect of alphabetical sub-
jective organization on memorizing un-
related words. Canadian Journal of Psy-
chology, 1962, 16, 185-191.

Tulving, E., McNulty, J. A., & Ozier, M.
Vividness of words and learning to learn 
in free-recall learning. Canadian Journal 
of Psychology, 1965, 19, 242-252. 

Twedt, H. M., & Underwood, B. J. Mixed
vs. unmixed lists in transfer studies. Jour- 
nal of Experimental Psychology, 1959, 58, 
111-116. 

Underwood, B. J. Associative transfer in ver-
bal learning as a function of response
similarity and degree of first list learning. 
Journal of Experimental Psychology, 1951, 
42, 44-54. 

Underwood, B. J., & Richardson, J. Some
verbal materials for the study of concept
formation. Psychological Bulletin, 1956,
53, 84-95.

Underwood, B. J., & Schulz, R. W. Mean-
ingfulness and verbal learning. Chicago:
Lippincott, 1960. 

Wallace, W. H., Turner, S. H., & Perkins,
C. C. Preliminary studies of human infor- 
mation storage. Signal Corps Project No. 
1320, 1957, Institute for Cooperative Re- 
search, University of Pennsylvania. 


APPENDIX A 
TABLE A1
Peg Words and Response Words for Experiments I-III

List I                  List II                 List III
Peg-Response            Peg-Response            Peg-Response

wheel-kitten            beet-limousine          hailstone-barn
flea-actor              puddle-village          bread-revolver
armor-closet            coffee-pillow           camel-tack
pearl-gas               bungalow-oyster         jellyfish-donkey
onion-balloon           pine-straw              city-diamond
pineapple-blood         atom-custard            gasoline-flannel
dime-brick              bracelet-ether          cranberry-turpentine
moss-peach              cradle-garbage          aluminum-flask
bed-whale               canary-buckle           cheese-fur
alley-tomato            spinach-doughnut        earthworm-vinegar
dirt-honey              lint-bandage            dagger-jewel
germ-spear              chestnut-needle         barrel-feather
baseball-cork           mouse-tobacco           asparagus-head
boulder-sewer           sardine-belly           dome-stadium
teeth-forest            baton-milk              fang-sunflower
dandelion-knuckle       moccasin-auditorium     grape-lard
enamel-nutshell         ammonia-bicycle         hog-gorilla
armpit-manure           olive-lizard            beak-grasshopper
tongue-derby            skin-ape                carrot-daffodil
minnow-asphalt          eyeball-knife           silk-banana
button-cabbage          pickle-tweezer          lawn-money
ski-fire                harpoon-rhinestone      zoo-corn
lemon-spool             knob-ivy                ice-anchor
fence-diary             walrus-seaweed          dish-moon
crown-telephone         booze-globe             bacon-scissors
salt-chalk              elephant-brass          paste-ocean
icicle-crystal          fishhook-crumb          measles-cigarette
cave-eel                ivory-saucer            skull-linen
snow-mansion            frost-padlock           glue-pill
rocket-goat             milk-butter             egg-night
jug-napkin              adobe-grass             pot-bone
sulphur-pup             cucumber-helmet         sugar-platter
chocolate-pollen        heart-drum              sauerkraut-badge
notebook-pin            velvet-horse            tunnel-cabin
cinnamon-rod            stone-monkey            rice-collar
hallway-cherry          diaper-coal             pear-rattlesnake
hairpin-lantern         gnat-skunk              radio-bean
ginger-freckle          pencil-pony             lips-capsule
hatchet-grapefruit      thimble-cigar           mustard-sheep
rabbit-apple            hospital-cauliflower    tar-snail

TABLE A2
Word Lists for Experiment IV

List I                        List M
Peg           Response        Peg           Response

cheese        black           platter       saucer
window        hot             silk          chamois
mountain      apple           ether         cigar
short         foot            cheese        mustard
eagle         sour            diamond       aluminum
spider        blue            cave          alley
lion          smoke           anchor        boulder
man           tiger           paste         tar
smooth        drink           diaper        cigarette
cabbage       home            bean          ivy
white         tall            ginger        clove
stem          head            asphalt       coal
whistle       lamb            lard          butter
tobacco       chair           tomato        flannel
bread         army            rod           hallway
king          butter          nail          ocean
fruit         queen           cucumber      straw
table         crackers        chestnut      freckle
cold          thread          button        doughnut
yellow        water           dagger        fishhook
river         train           hog           garbage
sheep         moth            eel           lizard
mutton        food            icicle        frost
hammer        light           auditorium    mansion
whiskey       nurse           buckle        napkin
cottage       web             cork          balloon
priest        bird            city          telephone
sweet         rug             sheep         peach
butterfly     hill            honey         sugar
needle        rough           tweezer       armor
soldier       door            grapefruit    sauerkraut
house         woman           velvet        kitten
hand          fast            baton         rattlesnake
dark          round           atom          crumb
earth         wool            puddle        gasoline
carpet        flower          rabbit        ape
eating        church          hailstone     bone
doctor        stars           crystal       turpentine
slow          nail            beak          needle
lake                          barrel        cradle


26 Gordon Wood


APPENDIX B 
Instructions for Experiment I


General Instructions for Experiment I 


This is an experiment on memory for 
words. Three lists of 40 words each will 
be read to you on this tape recorder. The 
words will be read at a rate of one word 
every 5 seconds. Each word will be read 
once; each list will be read once. Following 
the presentation of a list you will be 
asked to write as many of the words as 
you can remember on an answer sheet. As 
nearly as possible write down the words in 
the same order that they were read to you. 
That is, write the first word of each list in 
Blank 1 of your answer sheet, the second 
word in Blank 2, etc. If you do not re-
member the position of the word in the
list make a guess as to its position. Your
ability to recall the words is of primary 
interest, remembering the position of the 
word in the list is secondary. 

I am going to pass out some five-page
handouts. Do not open your handout until 
you are told to do so. Pages 3-5 of your 
handout are answer sheets. Page 3 is to
be used for the first list, Page 4 for the
second list, and Page 5 for the third list.
On the second page of your handout, you 
will find special instructions explaining 
how you are to learn the lists of words 
that will be read to you. The purpose of 
this experiment is to determine the effec- 
tiveness of different strategies for the 
memorizing of words, so it is extremely 
important that you utilize, to the best of 
your ability, the strategy that is explained 
to you on Page 2 of your handout. 

If you have any questions, now is the
time to ask them. However, it will not be
possible to answer questions concerning
the particular strategies to be employed 
because some of you are being asked to 
employ different approaches. Therefore,
questions concerning one strategy would 
only confuse the users of a different 
strategy. You will have to study Page 2 
until you understand what is expected of
you. Since some of the strategies are more 
complicated than others, we will wait 
until everyone assures me they understand
their approach before starting the tape
recorder. 


Special Instructions for Group 1 Ex- 
periment I 

Forty words have been written on your 
answer sheet to aid you in the memorizing 
of the words to be read on the tape re-
corder (taped words). Basically, your task
is to make a bizarre mental picture using 
each answer-sheet word and each taped 
word. That is, you are to try to form a 
bizarre visual image linking the first taped 
word (i.e., the first word read) and the
first answer-sheet word (i.e., the first
word on your answer sheet); then you are
to try to form a bizarre visual image link- 
ing the second taped word and the second 
answer-sheet word; then you are to try 
to form a bizarre visual image linking the 
third taped word and the third answer- 
sheet word, etc. Thus, after a list of taped 
words has been read, you should have 
formed (ideally) visual images linking each 
of the 40 taped words with one of the 40 
answer-sheet words. 

It is essential that you form a bizarre 
mental picture to link each answer-sheet 
and taped word because bizarre images 
tend to be easier to remember. For ex- 
ample, if automobile was the first word 
printed on your answer sheet and salt- 
shaker was the first taped word, you might 
imagine a huge saltshaker driving an 
automobile. This bizarre mental picture
should be rather easy to remember. Also, 
you should make no attempt to rehearse 
previously formed images. 

To recall the list of 40 taped words, 
you are to use the answer-sheet words. 
Each answer-sheet word (e.g., automobile) 
should evoke the bizarre image linking the 
taped word and answer-sheet word (e.g., a 
huge saltshaker driving an automobile), 
thus providing you with the taped word 
(saltshaker). 


Special Instructions for Group 4 Ex- 
periment I 


Forty words have been written on your 
answer sheet to aid you in the memorizing
of the words to be read on the tape re-
corder (taped words). Basically, your task
is to link each answer-sheet word and each 
taped word by a verbal connection. That 
is, you are to try to verbally link the first 
answer-sheet word (i.e., the first word on 
your answer sheet) and the first taped 
word (i.e. the first word read); then you 
are to try to verbally link the second 
answer-sheet word and the second taped 
word; then you are to try to verbally 
link the third answer-sheet word and the 
third taped word, etc. Thus, after the
taped words have been read, you should 
have (ideally) each of the 40 taped words 
verbally linked with one of the 40 answer- 
sheet words. 

You are free to link each answer-sheet 
word to a taped word with any verbal 
connection that seems appropriate. For 
example, if you want to link the answer- 
sheet word dog with the taped word nine, 
you may use the word cat to connect the 
two words. Then during recall the answer-
sheet word dog should elicit cat and cat
should elicit nine. If you want to link the 
answer-sheet word automobile with the 
taped word saltshaker, you might con- 
struct a sentence using saltshaker and 
automobile as key words (e.g. The salt- 
shaker is on the automobile.). Also, you 
may find it unnecessary to provide a ver- 
bal link for some of the answer-sheet and 
taped word pairs, that is, the two words 
may just “seem to go together.” 

To recall the list of 40 taped words, you 
are to use the answer-sheet words. Each
answer-sheet word should elicit a taped 
word because you have connected each 
answer-sheet word with a taped word. For 
example, automobile should remind you of 
the sentence, “The saltshaker is on the 
automobile.” and thus provide you with 
the peg word saltshaker. 


Special Instructions for Group 6 Ex-
periment I

Basically your task is to make a bizarre 
mental picture to link successive words in 
the list. For example, if automobile, salt- 
shaker, and dog were the first, second, and 
third words respectively of the list, your 
task would be to form a bizarre image to 
link these three words. A possible bizarre 
image might be a huge saltshaker driving
an automobile shaped like a dog. In a sim- 
ilar manner, you are to try to form bizarre 
visual images to link the remaining words 
of the list. The number of words you in- 
clude in any one image is optional. At one 
extreme, you might form 20 images, each 
containing two words; at the other ex- 
treme, you might form one bizarre image 
containing all the words. It is likely that 
forming 8-12 images (3-4 words per 
image) will result in the highest recall. 
Also, it is not necessary to have any word 
represented in two images. That is, if 
your first image links words 1, 2, 3, and 4, 
your second image should link words other 
than words 1, 2, 3, and 4 (e.g. words 5, 
6, and 7). Thus, after the words have been 
read, you should have included (ideally) 
each word in one of your bizarre mental 
pictures. 

To recall the list of 40 words, you need 
recall only the bizarre images that you 
have formed. It is expected that recalling 
bizarre images will be a relatively easy 
task. That is, one is not likely to forget a 
mental picture of a huge saltshaker driv- 
ing an automobile shaped like a dog. 

Special Instructions for Group 7 Ex-
periment I

You are free to learn the lists of words 
in whatever way you feel will result in the 
best performance. That is, adopt the strat- 
egy which is best for you. 

(Received November 4, 1966) 


